The Analysis of Variance

Fixed, Random and Mixed Models


Hardeo Sahai 
Mohammed I. Ageel 


Springer Science+Business Media, LLC 


Hardeo Sahai
University of Puerto Rico
San Juan, Puerto Rico
USA

University of Veracruz
Xalapa, Veracruz
Mexico

Mohammed I. Ageel
King Saud University
Abha Campus, Abha
Saudi Arabia


Library of Congress Cataloging-in-Publication Data 


Sahai, Hardeo. 
The analysis of variance : fixed, random and mixed models / 
Hardeo Sahai and Mohammed I. Ageel. 


p. cm. 
Includes bibliographical references. 
ISBN 978-1-4612-7104-8 ISBN 978-1-4612-1344-4 (eBook) 


DOI 10.1007/978-1-4612-1344-4 
1. Analysis of variance. I. Ageel, Mohammed I., 1959- 


II. Title. 
QA279.S22 2000
519.538 —DC21 98-2788 


CIP 


AMS Subject Classifications: 62H, 62J 


Printed on acid-free paper.
© 2000 Springer Science+Business Media New York 


Originally published by Birkhäuser Boston in 2000
Softcover reprint of the hardcover 1st edition 2000


This book contains information obtained from authentic and highly regarded sources. Reprinted 
material is quoted with permission, and sources are indicated. A wide variety of references are listed. 
Reasonable efforts have been made to publish reliable data and information, but the author and the 
publisher cannot assume responsibility for the validity of all materials or for the consequences of 
their use. 


All rights reserved. This work may not be translated or copied in whole or in part without the
written permission of the publisher (Springer Science+Business Media, LLC), except for brief
excerpts in connection with reviews or scholarly analysis. Use in connection with any form of
information storage and retrieval, electronic adaptation, computer software, or by similar or
dissimilar methodology now known or hereafter developed is forbidden.

The use of general descriptive names, trade names, trademarks, etc., in this publication, even if the 
former are not especially identified, is not to be taken as a sign that such names, as understood by 
the Trade Marks and Merchandise Marks Act, may accordingly be used freely by anyone. 


Formatted from the authors’ Microsoft Word files. 


To My Children: Amogh, Mrisa, Pankaj 
and 
to the Memory of My Father and Mother 


H.S. 


To My Family: Lyla, Ibrahim, Yagoub, Khalid and their mother 
and 
to My Father and Mother who contributed to this book by teaching me 
that hard work can be satisfying if the task is worthwhile


M.I.A.


“What model are you considering?” “I am not considering one — I am using analysis 
of variance.” 


N. R. Draper and H. Smith 


The analysis of variance is (not a mathematical theorem but) a simple method of arrang- 
ing arithmetical facts so as to isolate and display the essential features of a body of data 
with the utmost simplicity. 


Sir Ronald A. Fisher 


No aphorism is more frequently repeated in connection with field trials, than that we 
must ask Nature few questions or, ideally, one question, at a time. The writer is convinced
that this view is wholly mistaken. Nature, he suggests, will best respond to a logical 
and carefully thought out questionnaire; indeed, if we ask her a single question, she will 
often refuse to answer until some other topic has been discussed. 


Sir Ronald A. Fisher 


The statistician is no longer an alchemist expected to produce gold from any worthless 
material offered him. He is more like a chemist capable of assaying exactly how much 
of value it contains, and capable also of extracting this amount, and no more. In these 
circumstances, it would be foolish to commend a statistician because his results are 
precise or to reprove because they are not. If he is competent in his craft, the value of 
the result follows solely from the value of the material given him. It contains so much 
information and no more. His job is only to produce what it contains. 


Sir Ronald A. Fisher 


The new methods occupy an altogether higher plane than that in which ordinary statistics 
and simple averages move and have their being. Unfortunately, the ideas of which they 
treat, and still more the many technical phrases employed in them, are as yet unfamiliar. 
The arithmetic they require is laborious, and the mathematical investigations on which 
the arithmetic rests are difficult reading even for experts . . . this new departure in science 
makes its appearance under conditions that are unfavourable to its speedy recognition, 
and those who labour in it must abide for some time in patience before they can receive 
sympathy from the outside world. 


Sir Francis F. Galton 


When statistical data are collected as natural observations, the most sensible assumptions 
about the relevant statistical model have to be inserted. In controlled experimentation, 
however, randomness could be introduced deliberately into the design, so that any sys- 
tematic variability other than [that] due to imposed treatments could be eliminated. 


The second principle Fisher introduced naturally went with the first. With statistical 
analysis geared to the design, all variability not ascribed to the influence of treatments 
did not have to inflate the random error. With equal numbers of replications for the 
treatment, each replication could be contained in a distinct block, and only variability 
among plots in the same block were a source of error — that between blocks could be 
removed. 


Sir Maurice S. Bartlett 


Preface 


Analysis of variance (ANOVA) models have become some of the most widely
used tools of modern statistics for analyzing multifactor data. They provide
versatile statistical tools for studying the relationship between a dependent
variable and one or more independent variables, known as factors. ANOVA models
are employed to determine whether different factors interact and which factors
or factor combinations are most important. They are appealing because they
provide a conceptually simple technique for investigating these statistical
relationships.

Currently there are several texts and monographs available on the subject.
However, some of them, such as those of Scheffé (1959) and Fisher and
McDonald (1978), are written for mathematically advanced readers, requiring
a good background in calculus, matrix algebra, and statistical theory; whereas 
others such as Guenther (1964), Huitson (1971), and Dunn and Clark (1987), 
although they assume only a background in elementary algebra and statistics, 
treat the subject somewhat scantily and provide only a superficial discussion of 
the random and mixed effects analysis of variance. 

This book has been designed to bridge this gap. It provides a thorough and 
elementary discussion of the commonly employed analysis of variance mod- 
els with emphasis on intelligent applications of the methods. We have tried 
to present a logical development of the subject covering both the assump- 
tions made and methodological and computational details of the techniques 
involved. Most of the important results on estimation and hypothesis test- 
ing related to analysis of variance models have been included. An attempt 
has been made to present as many necessary concepts, principles, and tech- 
niques as possible without resorting to the use of advanced mathematics and 
statistical theory. In addition, the book contains complete citations of most 
of the important and related works, including an up-to-date and comprehen- 
sive bibliography of the field. A notable feature of the presentation is that the 
fixed, random, and mixed effect analysis of variance models are treated in 
tandem. 

No attempt has been made to present the theoretical derivations of the results 
and techniques employed in the text. In such cases the reader is referred to the 
appropriate sources where such discussions can be found. It is hoped that the 
inquisitive reader will go through these sources to get a thorough grounding in 
the theory involved. However, whenever considered appropriate, some elemen- 
tary derivations involving results on expectations of mean squares have been 
included. 


The computational formulae needed to perform analysis of variance calculations
are presented in full detail in order to facilitate the requisite computations
using a handheld scientific calculator. Many modern electronic calculators are 
sufficiently powerful to handle complex arithmetic and algebraic computations 
and can be readily employed for this task. We are of the opinion that researchers 
and scientists should clearly understand a procedure before using it and manual 
computations provide a better understanding of the working of a procedure as 
well as any limitations of the experimental data. In addition, a separate chapter 
has been included to describe the use of some well-known statistical packages 
to perform a computer-assisted analysis of variance and specific commands rel- 
evant to individual analysis of variance models are included. Each chapter con- 
tains a number of worked examples where both manual and computer-assisted 
analysis of variance computations are illustrated in complete detail. 

The only prerequisite for the understanding of the material is a preparation in 
a precalculus introductory course in statistical inference with special emphasis 
on the principles of estimation and hypothesis testing. Although some of the 
statistical concepts and principles used in the text (e.g., maximum likelihood 
and minimum variance unbiased estimation, and other related concepts and 
principles) may not be familiar to students who have not taken any intermediate 
and advanced level courses in statistical inference, these concepts have been 
included for the sake of completeness and to enhance the reference value of 
the text. However, the use of such results is generally incidental, without any 
mathematical and technical formality, and can be skipped without any loss 
in continuity. In addition, most of the results of this nature are usually kept 
out of the main body of the text and are generally indicated under a remark 
or following a footnote. Moreover, many important results in probability and 
statistics useful for understanding the analysis of variance models have been 
included in the appendices. 

The book can be employed as a textbook in an introductory analysis of 
variance course for students whose specialization is not statistics, such as those 
in biological, social, engineering, and management sciences, but who nevertheless
use analysis of variance quite extensively in their work. It can also be used as 
a supplement for a theoretical course in analysis of variance to balance and 
complement the theoretical aspects of the subject with practical applications. 
The book contains ample discussion and references to many theoretical results 
and will be immensely useful to students with advanced training in statistics. 
The investigators concerned with the analysis of variance techniques of data 
analysis will also find it a useful source of reference for many important results 
and applications. 

Inasmuch as the principles and procedures of the analysis of variance are 
fairly general and common to all academic disciplines, the book can be em- 
ployed as a text in all curricula. Although the examples and exercises are 
drawn primarily from behavioral, biological, engineering, and management 
sciences, the illustrations and interpretations are relevant to all disciplines. This
underscores the interdisciplinary nature of research problems from all substan- 
tive fields and the absolute generality and discipline-free nature of statistical 
theory and methodology. Examples and exercises, in addition to assessing basic 
definitional and computational skills, are designed to illustrate basic conceptual 
understanding including applications and interpretations. 

The textbook contains an abundance of footnotes and remarks. They are 
intended for statistically sophisticated readers who wish to pursue the subject 
matter in greater depth and it is not necessary that the student beginning an 
analysis of variance course read them. They often expand and elaborate on 
a particular theme, point the way to generalization and to other techniques, 
and make historical comments and remarks. In addition, they contain literature 
citations for further exploration of the topic and refer to finer points of theory and 
methods. We are confident that this approach will be pedagogically appealing 
and useful to readers with a higher degree of scholarly interest. 

Finally, there is something to be said concerning the use of real-life data. We 
certainly believe that real-life examples are important as motivational devices 
for students to convince them that the techniques are indeed used in substantive 
fields of research; and, whenever possible, we have tried to make use of data 
from actual experiments and studies reported in books and papers by other 
authors. However, real-life data that are easy to describe, without requiring too 
much time and space, and are helpful in illustrating a particular technique are not 
always easy to find. Thus, we have included many examples and exercises that 
are realistically constructed using hypothetical data in order to fit the illustration 
of a particular statistical model under consideration. We believe that in many 
instances the use of such “artificial” data is just as instructive and motivating as 
the “real” data. The reader interested in working through some more examples
and exercises involving real-data sets is referred to the books by Andrews and 
Herzberg (1985) and Hand et al. (1993). 


Acknowledgments 


No book is ever written in isolation and we would like to express our appreciation 
and gratitude to all the individuals and organizations who directly or indirectly 
have contributed to this book. 

The first author wants to thank Professor Richard L. Anderson for introducing 
him to analysis of variance models and encouraging him to do further work in 
this field. He would also like to acknowledge two sabbatical leaves (1978-1979
and 1993-1994) granted by the Administrative Board of the University
of Puerto Rico which provided the time to write this book. 

Over a period of ten years (1972-1982), the first author taught a course in 
analysis of variance at the University of Puerto Rico. It was taken primarily 
by students in substantive fields such as biology, business, psychology, and 
engineering, among others. An initial set of lectures was developed during this 
period and although this book drastically expands and updates those notes, it 
never would have been possible to embark upon a project of this magnitude 
without that groundwork. 

Parts of the manuscript were used as lecture notes during a short course on 
analysis of variance offered in the autumn of 1993 at the Division of Biostatis- 
tics of the University of Granada (Spain) Medical and Dental Faculties and we 
greatly benefited from the comments and criticisms received. We would partic- 
ularly like to thank Professor Antonio Martin Andrés for an invitation to offer 
the course. 

Several portions of the manuscript were written and revised during the first 
author’s successive visits (Spring 1994, Summer 1995, Winter 1996, Autumn 
1997, Summer 1998) to the University of Veracruz (Mexico) and he is espe- 
cially grateful to Dr. Mario Miguel Ojeda, and other University authorities for 
organizing and hosting the visits and providing a stimulating environment for 
research and study. 

A preliminary draft of the manuscript was used as a principal reference source 
in a course on analysis of variance offered at the Department of Mathematics 
and Statistics of the National University of Colombia in Santafé de Bogotá
during the summer of 1994 and we received much useful feedback. We would 
like to thank Dr. Jorge Martinez for organizing the visit and an invitation to 
offer the course. 

Several parts of the manuscript were used as primary reference materials 
for a short course on analysis of variance offered at the National University 
of Trujillo (Peru) and we greatly benefited from suggestions that improved the
text. We are especially indebted to Professor Segundo M. Chuquilin Teran and 
other University authorities for an invitation to offer the course. 

Selected portions of the manuscript were also used during a short course 
on analysis of variance offered at the Fifth Meeting of the International Bio- 
metric Society Network Meeting for Central America, Mexico, Colombia, and 
Venezuela, held in Xalapa, Mexico in August, 1997, and we received many 
useful ideas from the participants that included students and professors from 
substantive fields of research and a diverse group of professionals from industry 
and government. 

The authors would like to thank all the individuals who have participated in 
our analysis of variance courses, both on and off campus and in a variety of 
forums and settings, and have worked through “preprinted” versions of the text 
for their kind comments and sufferance. Undoubtedly, all their suggestions and 
remarks have had a positive effect on the book. We are also greatly indebted to 
authors of textbooks and journal articles which we have generously consulted 
and who are cited throughout the book. 

We have received several excellent comments and suggestions on various 
parts of the book from many of our friends and colleagues. We would espe- 
cially like to mention our appreciation to Dr. Shahariar Huda of the King Saud 
University (Saudi Arabia), Dr. Mario M. Ojeda of the University of Veracruz 
(Mexico), and Dr. Satish C. Misra of the Food and Drug Administration (USA) 
who have commented upon various parts of the manuscript. 

Of course, we did not address all the comments and suggestions received and 
any remaining errors are the authors’ sole responsibility. 

We would also like to record a note of appreciation to Dr. Russell D. Wolfinger
of the SAS Institute, Inc. (USA) and to Dr. David Nichols of SPSS, Inc. (USA) 
for reviewing contents of the manuscript on performing analysis of variance 
using statistical packages and pointing out some omissions and inaccuracies. 

Dr. Raul Micchiavelli of the University of Puerto Rico, Mr. Guadalupe 
Hernandez Lira of the University of Veracruz (Mexico), and Mr. Victor Alvarez 
of the University of Guatemala in Guatemala City assisted us in running worked 
examples using statistical packages and their helpful support is greatly
appreciated.

Dr. Jayanta Banarjee, Professor of Mechanical Engineering at the University 
of Puerto Rico, deserves special recognition for kindly helping us to construct 
a number of realistic exercises using hypothetical data. 

The first author wishes to extend a warm appreciation to members and staff 
of the Puerto Rico Center for Addiction Research, especially Dr. Rafaela R. 
Robles, Dr. Hector M. Colón, Ms. Carmen A. Marrero, M. P. H., Mr. Tomas L.
Matos, and Mr. Juan C. Reyes, M. P. H., who, as an innovative research group,
for well over a decade provided an intellectually stimulating environment and 
a lively research forum to discuss and debate the role of analysis of variance 
models in social and behavioral research. 

Our grateful and special thanks go to our publisher, especially Mr. Wayne 
Yuhasz, Executive Director of Computational Sciences and Engineering, for
his encouragement and support of the project. Equally, we would like to record 
our thanks to the editorial and production staff at Birkhäuser, especially
Mr. Michael Koy, Production Editor, for all their help and cooperation in bring- 
ing the project to its fruition. 

The authors and Birkhäuser would like to thank many authors, publishers,
and other organizations for their kind permission to use the data and to reprint 
whole or parts of statistical tables and charts from their previously published 
copyrighted materials, and the acknowledgments are made in the book where 
they appear. 

Finally, we must make a special acknowledgment of gratitude to our families, 
who were patient during the many hours of daily work devoted to the book, in 
what seemed like an endless process of revisions for finalizing the manuscript, 
and we are greatly indebted for their continued help and support. 

The authors welcome any suggestions and criticisms of the book in regard
to omissions, inaccuracies, corrections, additions, or ways of presentation that
could be addressed in any future revision of this work.


Contents 


Preiace.! 645 ba os Rhee ee RO ea Ode Ge ees ee Seer oe ix 
Acknowledgments... ..........0. 0.20. eee ee ee xill 
ist-or- Tables'< 2 4.5 @unsso/eo.4e wa bee eee Gas Be Oates Xxix 
ListorPisutes’ 2403 chen bu bebe Bbw ba Phe Reed bese ebas XXXili 
1. Antroduchon ... on. 6 «nove be eortiae f CAS Ree Ree BETES RE 1 
DO PICViCW t.36.% 2h es Ga ee Rew ee ee eet Bee eh Se ee 1 
1.1 Historical Developments......................... 3 
1.2 Analysis of Variance Models ...............-...--.4 4 
1.3. Concept of Fixed and Random Effects................. 6 
1.4 Finite and Infinite Populations ..................... 7 
1.5 General and Generalized Linear Models. ............... 8 
L.6°- -Scope or the Book. (44.5 vaso wat 4b ea se eee BI Ee SS 8 
2. One-Way Classification ........0.0.0.0... 02.0000 ee eee 1] 
DO “ETC VIOW <8 tad: Geter el SI De ee ae aioe. Adie. doe Sars 1] 
2.1 Mathematical Model ..................22-0-004- 1] 
2.2 Assumptions of the Model...................002. 1] 
2.3 Partition of the Total Sum of Squares ................ 14 
2.4 The Concept of Degrees of Freedom................. 15 
2.5 Mean Squares and Their Expectations................ 17 
2.6 Sampling Distribution of Mean Squares............... 20 
2.7 Test of Hypothesis: The Analysis of Variance F Test....... 22 
Model I (Fixed Effects) ..........0..........208. 22 
Model II (Random Effects) ...............0....... 24 
2.8 Analysis of Variance Table ..................0--. 26 

2.9 Point Estimation: Estimation of Treatment Effects and 
Vahiance COMpOnens: 24s 5 Ge dee oa ee Be oe are SS 26 
2.10 Confidence Intervals for Variance Components .......... 31 
2.11 Computational Formulae and Procedure. .............. 35 
2.12 Analysis of Variance for Unequal Number of Observations ... 36 
2.13 Worked Examples for ModelI .................... 39 
2.14 Worked Examples for Model lII.................... 43 
2.15 Use of Statistical Computing Packages ............... 52 
2.16 Worked Examples Using Statistical Packages ........... 52 
2.17 Power of the Analysis of Variance F Test.............. 52 
Model I (Fixed Effects) ..................2.2000. 57 
Model II (Random Effects) ........0..........0..0.. 60 


XVIil 


Contents 


2.18 Power and Determination of Sample Size .............. 


Sample Size Determination Using Smallest 
Detectable -Diflcrence. <2 sasha OS a eee H SE eR 


2.19 Inference About the Difference Between Treatment Means: 


Multiple Comparisons ..............-.2..-2 22008. 

Linear Combination of Means, Contrast and 

Orthogonal Contrasts... ....... 2... 00002 

Test of Hypothesis Involving a Contrast............... 

The Use of Multiple Comparisons .................. 
Tukey 'S Method? a... 4-5 4 ew al Se Se we es 
Schetfe Ss iethod. 2.) wdac% a ee Sad ere eee ee et 
Interpretation of Tukey’s and Scheffé’s methods ..... 
Comparison of Tukey’s and Scheffé’s methods ...... 

Other Multiple Comparison Methods ................ 
Least significant difference test ............... 
DOMICMONIS tOSb 5.5.10.05 & 9628. & Bd ee ode Sa ee es eS 
Dunn-Sidak’s test... 0.0.0.0. 0000. ee ee 
Newman-Keuls’s test .............2...2-00-4 
Duncan’s multiple range test... ...........--. 
DNS 1ESt3. parce ee onde Oe a ee ER ee ek 

Multiple Comparisons for Unequal Sample Sizes 

and: VallanCes: oxic ss, 4.6 ob he Aes kOe eS eS Ane eee 
Unequal sample sizes ................0-04.4 
Unequal population variances ................ 


2.20 Effects of Departures from Assumptions Underlying the 


22 


any 


Analysis of Variance Model. ................220004 
Departures from Normality ..................0000.% 
Departures from Equal Variances................--- 
Departures from Independence of Error Terms ........... 
Tests for Departures from Assumptions of the Model ....... 
Tests for Normality .........0.0...0 2.2... 0.0.0. 000048 
Chi-square goodness-of-fit test... ............. 
Vest-for skewness. 2.506445 e204 Sande ere eGR 
TESCIOE KUDLOSIS~ 6: 4-9-4. 3.06. Sora fw es A ode GAR Be eH 
Other Tests for Normality .................0.00008.% 
Shapiro-Wilk’s W test... ........0....22200. 
Shapiro-Francia’s test... 2... ee ee 
D’Agostino’s Dtest.. 2... . ee ee 
Tests for Homoscedasticity .................0008. 
PS ARLICUS (OSU og cae coe he ithe woe os dey Ge ew eS we oe acess 
FlAmley S1CSU ore ok ees oh ERAS ee eS 
COehran SWESEe i256 gence ee GP ial Sa ee Se 
Comments on Bartlett’s, Hartley’s and Cochran’s tests 
Other tests of homoscedasticity .............. 


Contents 


2.22 Corrections for Departures from Assumptions of the Model . . 
Transformations to Correct Lack of Normality .......... 
Logarithmic transformation ................ 

Square-root transformation. ..............-.. 

Arcsine transformation .................4.. 
Transformations to Correct Lack of Homoscedasticity ..... 
Logarithmic transformation ................ 

Square-root transformation ................. 

Reciprocal transformation ...............4.. 

Arcsine transformation .................2. 

Square transformation. ...............004. 

Power transformation .................20.4. 

EXClCISOS. 4.43) 3:4 64-344 DU ORE AREER EES EOS Ot ae es 


3. Two-Way Crossed Classification Without Interaction ....... 
3.0. PrevVieW esa caret eee Ba OEE es Sa e eS S53 
3.1 Mathematical Model ..........0.-. 0.2.2.2. 0 52 eee 
3.2 Assumptions of the Model.................020-. 
3.3 Partition of the Total Sum of Squares ............... 
3.4 Mean Squares and Their Expectations. .............. 

Model I (Fixed Effects) ..............2.2.2.0.2006. 
Model II (Random Effects) .................2004. 
Model III (Mixed Effects) ..............-200005. 
3.5 Sampling Distribution of Mean Squares.............. 
Model I (Fixed Effects) ............-....--2.-226. 
Model II (Random Effects) ..................... 
Model III (Mixed Effects) .....................2. 
3.6 Tests of Hypotheses: The Analysis of Variance F Tests 
Model I (Fixed Effects) ..............2....2005. 
Model II (Random Effects) ................-.554. 
Model III (Mixed Effects) .............2...-2-206. 
647 ‘Pomnt-EStimatiOn:<<« 22 % fn. 22:3 wore GF ore oe) Ao a 
Model I (Fixed Effects) ............2.2.2.-02-. 0006. 
Model II (Random Effects) .................044. 
Model III (Mixed Effects) ................0..0004. 
3.8 Interval Estimation ............ 2.2.2... 0 0 eee ee 
Model I (Fixed Effects) .............2.2.0--000068. 
Model II (Random Effects) .................206. 
Model III (Mixed Effects) ............2.2.0220006. 
3.9 Computational Formulae and Procedure. ............. 
3.10 Missing Observations... ....... 0... 0005 eee eae 
3.11 Power of the Analysis of Variance F Tests ............ 
3.12 Multiple Comparison Methods ................... 
3.13 Worked Example for ModelI .................... 


xix 


XX Contents 
3.14 Worked Example for Model II ................... 154 
3.15 Worked Example for Model III ................... 157 
3.16 Worked Example for Missing Value Analysis .......... 161 
3.17 Use of Statistical Computing Packages .............. 164 
3.18 Worked Examples Using Statistical Packages .......... 164 
3.19 Effects of Violations of Assumptions of the Model ....... 168 
PXCICISCS* 24 een, fh tiesto Ge Bete ae Aen ee ee a Get ete 168 

4. Two-Way Crossed Classification With Interaction ........ 177 
A> “PTC VICW  & ie. 6 ve chsh eg tod. Ae Bree EA Os we, HE eda ah 177 
4.1 Mathematical Model ....................2.02. 177 
4.2 Assumptions of the Model ............-.....0.0.. 180 
4.3 Partition of the Total Sum of Squares............... 182 
4.4 Mean Squares and Their Expectations .............. 184 

Model I (Fixed Effects)... ........ 2.000000 eee 186 
Model II (Random Effects). .................... 188 
Model III (Mixed Effects) ..................0.. 190 

4.5 Sampling Distribution of Mean Squares ............. 193 
Model I (Fixed Effects)... .......... 0.00000 ae 193 
Model II (Random Effects)..................0.. 196 
Model III (Mixed Effects) ............0......0.. 196 

4.6 Tests of Hypotheses: The Analysis of Variance F Tests... . 197 
Model I (Fixed Effects)... ......0.... 0.0.0.0 2008 8 198 

Test for AB interactions ...............200.4 198 

Test for factor B effects...............2.00. 198 

Test for factor A effects .............0. 22005 199 

Model II (Random Effects)... .. op ND Geeta See ane tes er ws 2 200 

Test for AB interactions .................4.. 200 

Test for factor B effects. .................. 201 

Test for factor A effects ...........0.....4.. 201 

Model III (Mixed Effects) .............0........ 202 

Test for AB interactions ................0.. 202 

Test for factor B effects. ...............04. 202 

Test for factor A effects. ..............00.. 202 

Summary of Models and Tests........... ee eee 203 

AS] -Point EStinatiOn .4°%.4-4 oA wae SEO eee ee 203 
Model I (Fixed Effects). ..........0..0.......4.. 203 
Model II (Random Effects). .................0... 205 
Model III (Mixed Effects) ..................0... 206 

4.8 Interval Estimation ............ 2.0.2... 000 eee 207 
Model I (Fixed Effects). ...............0.2 02048. 207 
Model II (Random Effects)..................... 208 
Model III (Mixed Effects) ..................0.. 209 


4.9 Computational Formulae and Procedure ............. 210 


Contents XXxi 


4.10 Analysis of Variance with Unequal Sample Sizes Per Cell . . 212 


Fixed Effects Analysis ........ 215
Proportional frequencies ........ 215
General case of unequal frequencies ........ 217
Random Effects Analysis ........ 223
Proportional frequencies ........ 223
General case of unequal frequencies ........ 223
Mixed Effects Analysis ........ 226
4.11 Power of the Analysis of Variance F Tests ........ 227
Model I (Fixed Effects) ........ 227
Test for AB interactions ........ 227
Test for factor B effects ........ 227
Test for factor A effects ........ 227
Model II (Random Effects) ........ 228
Test for AB interactions ........ 228
Test for factor B effects ........ 228
Test for factor A effects ........ 228
Model III (Mixed Effects) ........ 229
Test for AB interactions ........ 229
Test for factor B effects ........ 229
Test for factor A effects ........ 229
4.12 Multiple Comparison Methods ........ 230
4.13 Worked Example for Model I ........ 233
4.14 Worked Example for Model I: Unequal Sample Sizes Per Cell ........ 237
4.15 Worked Example for Model II ........ 244
4.16 Worked Example for Model III ........ 247
4.17 Use of Statistical Computing Packages ........ 252
4.18 Worked Examples Using Statistical Packages ........ 253
4.19 The Meaning and Interpretation of Interaction ........ 253
4.20 Interaction With One Observation Per Cell ........ 259
4.21 Alternate Mixed Models ........ 264
4.22 Effects of Violations of Assumptions of the Model ........ 268
Model I (Fixed Effects) ........ 269
Model II (Random Effects) ........ 269
Model III (Mixed Effects) ........ 270
Exercises ........ 270

5. Three-Way and Higher-Order Crossed Classifications ..... . 281 

5.0 Preview ........ 281

5.1 Mathematical Model .................-2-.-204- 281 

5.2 Assumptions of the Model ..................... 284 

5.3 Partition of the Total Sum of Squares............... 285 


5.4 Mean Squares and Their Expectations .............. 286 


5.5 Tests of Hypotheses: The Analysis of Variance F Tests... . 286 
Model I (Fixed Effects)... .........0202.2.2..0.00... 288 
Model II (Random Effects). .................... 288 
Model III (Mixed Effects) ..................... 291 
5.6 Point and Interval Estimation ................... 292 
5.7 Computational Formulae and Procedure............. 297 
5.8 Power of the Analysis of Variance F Tests ........... 298 
5.9 Multiple Comparison Methods .................. 299 
5.10 Three-Way Classification with One Observation Per Cell... 301 
5.11 Four-Way Crossed Classification ................. 302 
5.12 Higher-Order Crossed Classifications .............. 307 
5.13 Unequal Sample Sizes in Three- and Higher-Order Classifications ........ 311
5.14 Worked Example for Model I ........ 314
5.15 Worked Example for Model II................... 322 
5.16 Worked Example for Model III .................. 328 
5.17 Use of Statistical Computing Packages.............. 333 
5.18 Worked Examples Using Statistical Packages.......... 334 
Exercises ........ 338
6. Two-Way Nested (Hierarchical) Classification ........ 347
6.0 Preview ........ 347
6.1 Mathematical Model ......................2.. 349 
6.2 Assumptions of the Model ..................2.0.8. 350 
6.3 Analysis of Variance ...............22 220 eee 350 
6.4 Tests of Hypotheses: The Analysis of Variance F Tests... . 351 
6.5 Point Estimation ........ 354
Model I (Fixed Effects)... .......0..........0.0.. 354 
Model II (Random Effects). .................... 355 
Model III (Mixed Effects) .................0.0.0.. 356 
6.6 Interval Estimation .............. 0.000 ee eee 356 
Model I (Fixed Effects). ........0..........0.2.20. 356 
Model II (Random Effects)... ................... 358 
Model III (Mixed Effects) ..................... 359 
6.7 Computational Formulae and Procedure ............. 359 
6.8 Power of the Analysis of Variance F Tests ........... 360 
6.9 Multiple Comparison Methods .................. 360 
6.10 Unequal Numbers in the Subclasses ............... 362 
Tests of Hypotheses ........ 363
Point and Interval Estimation ................... 365 
6.11 Worked Example for Model I ........ 368
6.12 Worked Example for Model II ........ 371
6.13 Worked Example for Model II: Unequal Numbers in the Subclasses ........ 374




6.15 Use of Statistical Computing Packages.............. 381 
6.16 Worked Examples Using Statistical Packages. ......... 381 
Exercises ........ 386
7. Three-Way and Higher-Order Nested Classifications ........ 395
7.0 Preview ........ 395
7.1 Mathematical Model .................-.220..- 395 
7.2 Analysis of Variance .............0.02. 20020 eee 396 
7.3. Tests of Hypotheses and Estimation ............... 399 
7.4 Unequal Numbers in the Subclasses ............... 400 
7.5 Four-Way Nested Classification. ................. 403 
7.6 General g-Way Nested Classification. .............. 406 
7.7 Worked Example for Model II ........ 407
7.8 Worked Example for Model II: Unequal Numbers in the Subclasses ........ 411
7.9 Worked Example for Model III .................. 417 
7.10 Use of Statistical Computing Packages.............. 421 
7.11 Worked Examples Using Statistical Packages. ......... 421 
Exercises ........ 425
8. Partially Nested Classifications ........ 431
8.0 Preview ........ 431
8.1 Mathematical Model ....................-.--- 431 
8.2 Analysis of Variance ............-.-.2.22-000- 433 
8.3. Computational Formulae and Procedure ............. 437 
8.4 A Four-Factor Partially Nested Classification. ......... 438 
8.5 Worked Example for Model II ........ 439
8.6 Worked Example for Model III ........ 444
8.7 Use of Statistical Computing Packages. ............. 448 
8.8 Worked Example Using Statistical Packages .......... 451 
Exercises ........ 451
9. Finite Population and Other Models ........ 461
9.0 Preview ........ 461
9.1 One-Way Finite Population Model ................ 461 
9.2 Two-Way Crossed Finite Population Model ........... 462 
Tests of Hypotheses ........ 465
PO TCNG ais ie Bm a eS hag Gwe ee ie 8. SI 466 
Point Estimation ........ 467
Interval Estimation ........ 468
9.3. Three-Way Crossed Finite Population Model .......... 470 
9.4 Four-Way Crossed Finite Population Model. .......... 472 
9.5 Nested Finite Population Models ................. 474 
9.6 Unbalanced Finite Population Models .............. 475 


9.7 Worked Example for a Finite Population Model ........ 475 


9.8 Other Models ........ 481
9.9 Use of Statistical Computing Packages ........ 481
Exercises ........ 481
10. Some Simple Experimental Designs ........ 483
10.0 Preview ........ 483
10.1 Principles of Experimental Design ................ 483 
Replication ........ 483
Randomization ........ 483
Control ........ 484
10.2 Completely Randomized Design ........ 485
Model and Analysis ........ 485
Worked Example ........ 487
10.3 Randomized Block Design ........ 488
Model and Analysis ........ 490
Both blocks and treatments fixed ........ 490
Both blocks and treatments random ........ 492
Blocks random and treatments fixed ........ 492
Blocks fixed and treatments random ........ 493
Missing Observations ........ 493
Relative Efficiency of the Design ........ 493
Replications ........ 494
Worked Example ........ 494
10.4 Latin Square Design ........ 495
Model and Analysis ........ 498
Point and Interval Estimation ........ 500
Power of the F Test ........ 501
Multiple Comparisons ........ 501
Computational Formulae ........ 502
Missing Observations ........ 502
Tests for Interaction ........ 503
Relative Efficiency of the Design ........ 503
Replications ........ 504
Worked Example ........ 505
10.5 Graeco-Latin Square Design ........ 507
Model and Analysis ........ 507
Worked Example ........ 510
10.6 Split-Plot Design ........ 512
Model and Analysis ........ 513
Worked Example ........ 516
10.7 Other Designs ........ 516
Incomplete Block Designs ........ 516
Lattice Designs ........ 519
Youden Squares ........ 520
Cross-Over Designs ........ 520




Repeated Measures Designs .................0.. 
Hyper-Graeco-Latin and Hyper Squares ............. 
Magic and Super Magic Latin Squares.............. 
Split-Split-Plot Design ................-.. 00048. 
2? Design and Fractional Replications .............. 


10.8 Use of Statistical Computing Packages
Exercises
11. Analysis of Variance Using Statistical Computing Packages
11.0 Preview
11.1 Analysis of Variance Using SAS
11.2 Analysis of Variance Using SPSS
11.3 Analysis of Variance Using BMDP
11.4 Use of Statistical Packages for Computing Power
11.5 Use of Statistical Packages for Multiple Comparison Procedures
11.6 Use of Statistical Packages for Tests of Homoscedasticity
11.7 Use of Statistical Packages for Tests of Normality
Appendices
A Student’s t Distribution
B Chi-Square Distribution
C Sampling Distribution of (n − 1)S²/σ²
D F Distribution
E Noncentral Chi-Square Distribution
F Noncentral and Doubly Noncentral t Distributions
G Noncentral F Distribution
H Doubly Noncentral F Distribution
I Studentized Range Distribution
J Studentized Maximum Modulus Distribution
K Satterthwaite Procedure and Its Application to Analysis of Variance
L Components of Variance
M Intraclass Correlation
N Analysis of Covariance
O Equivalence of the Anova F and Two-Sample t Tests
P Equivalence of the Anova F and Paired t Tests
Q Expected Value and Variance
R Covariance and Correlation
S Rules for Determining the Analysis of Variance Model
T Rules for Calculating Sums of Squares and Degrees of Freedom
U Rules for Finding Expected Mean Squares
V Samples and Sampling Distribution
W Methods of Statistical Inference
X Some Selected Latin Squares.................... 601 
Y Some Selected Graeco-Latin Squares............... 602 
Z PROC MIXED Outputs for Some Selected Worked Examples ........ 603

Statistical Tables and Charts ........ 605

Tables
I Cumulative Standard Normal Distribution ........ 605
II Percentage Points of the Standard Normal Distribution ........ 607
III Critical Values of the Student’s t Distribution ........ 608
IV Critical Values of the Chi-Square Distribution ........ 610
V Critical Values of the F Distribution ........ 612
VI Power of the Student’s t Test ........ 618
VII Power of the Analysis of Variance F Test ........ 621
VIII Power Values and Optimum Number of Levels for Total Number of Observations in the One-Way Random Effects Analysis of Variance F Test ........ 630
IX Minimum Sample Size Per Treatment Group Needed for a Given Value of p, α, 1 − β and Effect Size (C) in Sigma Units ........ 634
X Critical Values of the Studentized Range Distribution ........ 636
XI Critical Values of the Dunnett’s Test ........ 639
XII Critical Values of the Duncan’s Multiple Range Test ........ 642
XIII Critical Values of the Bonferroni t Statistic and Dunn’s Multiple Comparison Test ........ 645
XIV Critical Values of the Dunn-Šidák’s Multiple Comparison Test ........ 648
XV Critical Values of the Studentized Maximum Modulus Distribution ........ 652
XVI Critical Values of the Studentized Augmented Range Distribution ........ 654
XVII(a) Critical Values of the Distribution of √b₁ for Testing Skewness ........ 655
XVII(b) Critical Values of the Distribution of b₂ for Testing Kurtosis ........ 655
XVIII Coefficients of Order Statistics for the Shapiro-Wilk’s W Test for Normality ........ 656
XIX Critical Values of the Shapiro-Wilk’s W Test for Normality ........ 657
XX Critical Values of the D’Agostino’s D Test for Normality ........ 658
XXI Critical Values of the Bartlett’s Test for Homogeneity of Variances ........ 659
XXII Critical Values of the Hartley’s Maximum F Ratio Test for Homogeneity of Variances ........ 663
XXIII Critical Values of the Cochran’s C Test for Homogeneity of Variances ........ 664
XXIV Random Numbers ........ 665

Charts
I Power Functions of the Two-Sided Student’s t Test ........ 669
II Power Functions of the Analysis of Variance F Tests (Fixed Effects Model): Pearson-Hartley Charts ........ 672
III Operating Characteristic Curves for the Analysis of Variance F Tests (Random Effects Model) ........ 681
IV Curves of Constant Power for Determination of Sample Size in a One-Way Analysis of Variance (Fixed Effects Model): Feldt-Mahmoud Charts ........ 686

References ........ 689
Author Index ........ 717
Subject Index ........ 725


List of Tables 


Table 2.1 Analysis of Variance for Model (2.1.1) ........ 26
Table 2.2 Analysis of Variance for Model (2.1.1) with Unequal Sample Sizes ........ 37
Table 2.3 Data on Yields of Four Different Varieties of Wheat (in bushels per acre) ........ 39
Table 2.4 Analysis of Variance for the Yields Data of Table 2.3 ........ 41
Table 2.5 Data on Blood Analysis of Animals Injected with Five Drugs ........ 41
Table 2.6 Analysis of Variance for the Blood Analysis Data of Table 2.5 ........ 43
Table 2.7 Interview Ratings by Five Staff Members ........ 43
Table 2.8 Analysis of Variance for the Interview Ratings Data of Table 2.7 ........ 45
Table 2.9 Data on Yields of Six Varieties of Corn (in bushels per acre) ........ 48
Table 2.10 Analysis of Variance for the Yields Data of Table 2.9 ........ 50
Table 2.11 Pairwise Differences of Sample Means $\bar{y}_{i.} - \bar{y}_{i'.}$
Table 2.12 Calculations for Shapiro-Francia’s Test ........ 95
Table 2.13 Calculations for Bartlett’s Test ........ 101
Table 2.14 Data on Log-bids of Five Texas Offshore Oil and Gas Leases ........ 103
Table 3.1 Data for a Two-Way Experimental Layout ........ 126
Table 3.2 Analysis of Variance for Model (3.1.1) ........ 134
Table 3.3 Loss in Weights Due to Wear Testing of Four Materials (in mg) ........ 151
Table 3.4 Analysis of Variance for the Weight Loss Data of Table 3.3 ........ 153
Table 3.5 Number of Minutes Observed in Grazing ........ 154
Table 3.6 Analysis of Variance for the Grazing Data of Table 3.5 ........ 156
Table 3.7 Breaking Strength of the Plastics (in lbs.) ........ 158
Table 3.8 Analysis of Variance for the Breaking Strength Data of Table 3.7 ........ 160
Table 3.9 Analysis of Variance for the Weight Loss Data of Table 3.3 with One Missing Value ........ 163
Table 4.1 Data for a Two-Way Crossed Classification with n Replications per Cell ........ 178
Table 4.2 Analysis of Variance for Model (4.1.1) ........ 194
Table 4.3 Test Statistics for Models I, II, and III ........ 203




Table 4.4 Data for a Two-Way Crossed Classification with $n_{ij}$ Replications per Cell ........ 214
Table 4.5 Analysis of Variance for the Unbalanced Fixed Effects Model in (4.1.1) with Proportional Frequencies ........ 216
Table 4.6 Unweighted Means Analysis for the Unbalanced Fixed Effects Model in (4.1.1) with Disproportional Frequencies ........ 218
Table 4.7 Weighted Means Analysis for the Unbalanced Fixed Effects Model in (4.1.1) with Disproportional Frequencies ........ 221
Table 4.8 Analysis of Variance for Model (4.10.11) ........ 226
Table 4.9 Running Time (in Seconds) to Complete a 1.5 Mile Course ........ 234
Table 4.10 Analysis of Variance for the Running Time Data of Table 4.9 ........ 236
Table 4.11 Weight Gains (in grams) of Rats under Different Diets (Data Made Unbalanced by Deleting Observations) ........ 238
Table 4.12 Analysis of Variance Using Unweighted and Weighted Squares-of-Means Analysis of the Unbalanced Data on Weight Gains (in grams) of Table 4.11 ........ 243
Table 4.13 Screen Lengths (in inches) from a Quality Control Experiment ........ 245
Table 4.14 Analysis of Variance for the Screen Lengths Data of Table 4.13 ........ 246
Table 4.15 Yield Loads for Cement Specimens ........ 248
Table 4.16 Analysis of Variance for the Yield Loads Data of Table 4.15 ........ 250
Table 4.17 Analysis of Variance for Model (4.20.1) ........ 261
Table 4.18 Analysis of Variance for an Alternate Mixed Model ........ 265
Table 5.1 Data for a Three-Way Crossed Classification with n Replications per Cell ........ 282
Table 5.2 Analysis of Variance for Model (5.1.1) ........ 287
Table 5.3 Tests of Hypotheses for Model (5.1.1) under Model I ........ 289
Table 5.4 Tests of Hypotheses for Model (5.1.1) under Model II ........ 292
Table 5.5 Tests of Hypotheses for Model (5.1.1) under Model III (A Fixed, B and C Random) ........ 293
Table 5.6 Tests of Hypotheses for Model (5.1.1) under Model III (A Random, B and C Fixed) ........ 294
Table 5.7 Estimates of Parameters and Their Variances under Model I ........ 295
Table 5.8 Analysis of Variance for Model (5.10.1) ........ 303
Table 5.9 Analysis of Variance for the Unbalanced Fixed Effects Model in (5.1.1) with Proportional Frequencies ........ 314
Table 5.10 Data on Average Resistivities (in m-ohms/cm³) for Electrolytic Chromium Plate Example ........ 316
Table 5.11 Cell Totals $y_{ijk.}$ ........ 317
Table 5.12 Sums over Levels of Degreasing $y_{ij..}$ ........ 317




Table 5.13 Sums over Levels of Time ........ 317
Table 5.14 Sums over Levels of Temperature ........ 318
Table 5.15 Analysis of Variance for the Resistivity Data of Table 5.10 ........ 320
Table 5.16 Data on the Melting Points of a Homogeneous Sample of Hydroquinone ........ 323
Table 5.17 Sums over Analysts ........ 323
Table 5.18 Sums over Weeks ........ 324
Table 5.19 Sums over Thermometer ........ 324
Table 5.20 Analysis of Variance for the Melting Point Data of Table 5.16 ........ 326
Table 5.21 Data on Diamond Pyramid Hardness Number of Dental Fillings Made from Two Alloys of Gold ........ 328
Table 5.22 Sums over Dentists ........ 329
Table 5.23 Sums over Methods ........ 330
Table 5.24 Sums over Alloys ........ 330
Table 5.25 Analysis of Variance for the Gold Hardness Data of Table 5.21 ........ 332
Table 6.1 Data for a Two-Way Nested Classification ........ 349
Table 6.2 Analysis of Variance for Model (6.1.1) ........ 352
Table 6.3 Point Estimates and Their Variances under Model III ........ 357
Table 6.4 Analysis of Variance for Model (6.1.1) Involving Unequal Numbers in the Subclasses ........ 364
Table 6.5 Weight Gains of Chickens Placed on Four Feeding Treatments ........ 368
Table 6.6 Analysis of Variance for the Weight Gains Data of Table 6.5 ........ 370
Table 6.7 Moisture Content from Two Analyses on Two Samples of 15 Batches of Pigment Paste ........ 372
Table 6.8 Analysis of Variance for the Moisture Content of Pigment Paste Data of Table 6.7 ........ 373
Table 6.9 Data on Weight Gains from a Breeding Experiment ........ 375
Table 6.10 Analysis of Variance for the Weight Gains Data of Table 6.9 ........ 377
Table 6.11 Average Daily Weight Gains of Two Pigs of Each Litter ........ 379
Table 6.12 Analysis of Variance for the Weight Gains Data of Table 6.11 ........ 380
Table 7.1 Analysis of Variance for Model (7.1.1) ........ 398
Table 7.2 Analysis of Variance for Model (7.1.1) Involving Unequal Numbers in the Subclasses ........ 402
Table 7.3 Analysis of Variance for Model (7.5.1) ........ 405
Table 7.4 Percentage of Ingredient of a Batch of Material ........ 408
Table 7.5 Analysis of Variance for the Material Homogeneity Data of Table 7.4 ........ 410
Table 7.6 Number of Diatoms per Square Centimeter Colonizing Glass Slides ........ 412
Table 7.7 Analysis of Variance for the Diatom Data of Table 7.6 ........ 415




Table 7.8 Glycogen Content of Rat Livers in Arbitrary Units ........ 418
Table 7.9 Analysis of Variance for the Glycogen Data of Table 7.8 ........ 420
Table 8.1 Analysis of Variance for Model (8.1.1) ........ 436
Table 8.2 Analysis of Variance for Model (8.4.1) ........ 440
Table 8.3 Analysis of Variance for the Calcium Content of Turnip Leaves Data ........ 441
Table 8.4 Measured Strengths of Tire Cords from Two Plants Using Different Production Processes ........ 444
Table 8.5 Calculation of Cell and Marginal Totals ........ 446
Table 8.6 Analysis of Variance for the Tire Cords Strength Data of Table 8.4 ........ 449
Table 9.1 Analysis of Variance for Model (9.1.1) ........ 462
Table 9.2 Analysis of Variance for Model (9.2.1) ........ 464
Table 9.3 Expected Mean Squares for Model (9.3.1) ........ 472
Table 9.4 Analysis of Variance for Model (9.5.1) ........ 475
Table 9.5 Production Output from an Industrial Experiment ........ 476
Table 9.6 Analysis of Variance for the Production Output Data of Table 9.5 ........ 477
Table 10.1 Analysis of Variance for the Completely Randomized Design with Equal Sample Sizes ........ 486
Table 10.2 Analysis of Variance for the Completely Randomized Design with Unequal Sample Sizes ........ 487
Table 10.3 Data on Weights of Mangold Roots in a Uniformity Trial ........ 488
Table 10.4 Analysis of Variance for the Data on Weights of Mangold Roots ........ 488
Table 10.5 Analysis of Variance for the Randomized Block Design ........ 491
Table 10.6 Data on Yields of Wheat Straw from a Randomized Block Experiment ........ 494
Table 10.7 Analysis of Variance for the Data on Yields of Wheat Straw ........ 495
Table 10.8 Analysis of Variance for the Latin Square Design ........ 499
Table 10.9 Data on Responses of Monkeys to Different Stimulus Conditions ........ 505
Table 10.10 Analysis of Variance for the Data on Responses of Monkeys to Different Stimulus Conditions ........ 505
Table 10.11 Analysis of Variance for the Graeco-Latin Square Design ........ 509
Table 10.12 Data on Photographic Density for Different Brands of Flash Bulbs ........ 510
Table 10.13 Analysis of Variance for the Data on Photographic Density for Different Brands of Flash Bulbs ........ 510
Table 10.14 Analysis of Variance for the Split-Plot Design ........ 515
Table 10.15 Data on Yields of Two Varieties of Soybean ........ 517
Table 10.16 Analysis of Variance for the Data on Yields of Two Varieties of Soybean ........ 517


List of Figures 


Figure 2.1 Schematic Representation of Model I and Model II ........ 13
Figure 2.2 Program Instructions and Output for the One-Way Fixed Effects Analysis of Variance: Data on Yields of Four Different Varieties of Wheat (Table 2.3) ........ 53
Figure 2.3 Program Instructions and Output for the One-Way Fixed Effects Analysis of Variance with Unequal Numbers of Observations: Data on Blood Analysis of Animals Injected with Five Drugs (Table 2.5) ........ 54
Figure 2.4 Program Instructions and Output for the One-Way Random Effects Analysis of Variance: Data on Interview Ratings by Five Staff Members (Table 2.7) ........ 55
Figure 2.5 Program Instructions and Output for the One-Way Random Effects Analysis of Variance with Unequal Numbers of Observations: Data on Yields of Six Varieties of Corn (Table 2.9) ........ 56
Figure 2.6 Curves Exhibiting Positive and Negative Skewness and Symmetrical Distribution ........ 91
Figure 2.7 Curves Exhibiting Positive and Negative Kurtosis and the Normal Distribution ........ 92
Figure 3.1 Program Instructions and Output for the Two-Way Fixed Effects Analysis of Variance with One Observation per Cell: Data on Loss in Weights Due to Wear Testing of Four Materials (Table 3.3) ........ 165
Figure 3.2 Program Instructions and Output for the Two-Way Random Effects Analysis of Variance with One Observation per Cell: Data on Number of Minutes Observed in Grazing (Table 3.5) ........ 166
Figure 3.3 Program Instructions and Output for the Two-Way Mixed Effects Analysis of Variance with One Observation per Cell: Data on Breaking Strength of the Plastics (Table 3.7) ........ 167
Figure 4.1 Program Instructions and Output for the Two-Way Fixed Effects Analysis of Variance with Two Observations per Cell: Data on Running Time (in seconds) to Complete a 1.5 Mile Course (Table 4.9) ........ 254
Figure 4.2 Program Instructions and Output for the Two-Way Fixed Effects Analysis of Variance with Unequal Numbers of Observations per Cell: Data on Weight Gains of Rats under Different Diets (Table 4.11) ........ 255
Figure 4.3 Program Instructions and Output for the Two-Way Random Effects Analysis of Variance with Two Observations per Cell: Data on Screen Lengths from a Quality Control Experiment (Table 4.13) ........ 256
Figure 4.4 Program Instructions and Output for the Two-Way Mixed Effects Analysis of Variance with Three Observations per Cell: Data on Yield Loads for Cement Specimens (Table 4.15) ........ 258
Figure 4.5 Patterns of Observed Cell Means and Existence or Nonexistence of Interaction Effects ........ 260
Figure 5.1 Program Instructions and Output for the Three-Way Fixed Effects Analysis of Variance: Data on Average Resistivities (in m-ohms/cm³) for Electrolytic Chromium Plate Example (Table 5.10) ........ 334
Figure 5.2 Program Instructions and Output for the Three-Way Random Effects Analysis of Variance: Data on the Melting Points of a Homogeneous Sample of Hydroquinone (Table 5.16) ........ 335
Figure 5.3 Program Instructions and Output for the Three-Way Mixed Effects Analysis of Variance: Data on Diamond Pyramid Hardness Number of Dental Fillings Made from Two Alloys of Gold (Table 5.21) ........ 337
Figure 6.1 A Layout for the Two-Way Nested Design Where Barrels Are Nested within Locations ........ 348
Figure 6.2 A Layout for the Two-Way Nested Design Where Spindles Are Nested within Machines ........ 348
Figure 6.3 Program Instructions and Output for the Two-Way Fixed Effects Nested Analysis of Variance: Weight Gains Data for Example of Section 6.11 (Table 6.5) ........ 382
Figure 6.4 Program Instructions and Output for the Two-Way Random Effects Nested Analysis of Variance: Moisture Content of Pigment Paste Data for Example of Section 6.12 (Table 6.7) ........ 383
Figure 6.5 Program Instructions and Output for the Two-Way Random Effects Nested Analysis of Variance with Unequal Numbers in the Subclasses: Breeding Data for Example of Section 6.13 (Table 6.9) ........ 384
Figure 6.6 Program Instructions and Output for the Two-Way Mixed Effects Nested Analysis of Variance: Average Daily Weight Gains Data for Example of Section 6.14 (Table 6.11) ........ 385
Figure 7.1 A Layout for the Three-Way Nested Design Where Barrels Are Nested within Vats and Samples Are Nested within Barrels ........ 396




Figure 7.2 Program Instructions and Output for the Three-Way Random Effects Nested Analysis of Variance: Material Homogeneity Data for Example of Section 7.7 (Table 7.4) ........ 421
Figure 7.3 Program Instructions and Output for the Three-Way Random Effects Nested Analysis of Variance with Unequal Numbers in the Subclasses: Diatom Data for Example of Section 7.8 (Table 7.6) ........ 423
Figure 7.4 Program Instructions and Output for the Three-Way Mixed Effects Nested Analysis of Variance: Glycogen Data for Example of Section 7.9 (Table 7.8) ........ 424
Figure 8.1 A Layout for the Partially Nested Design Where Days Are Crossed with Methods and Operators Are Nested within Methods ........ 432
Figure 8.2 Program Instructions and Outputs for the Partially Nested Mixed Effects Analysis of Variance: Measured Strengths of Tire Cords from Two Plants Using Different Production Processes (Table 8.4) ........ 450
Figure 10.1 Program Instructions and Output for the Completely Randomized Design: Data on Weights of Mangold Roots in a Uniformity Trial (Table 10.3) ........ 489
Figure 10.2 A Layout of a Randomized Block Design ........ 490
Figure 10.3 Program Instructions and Output for the Randomized Block Design: Data on Yields of Wheat Straw from a Randomized Block Design (Table 10.6) ........ 496
Figure 10.4 Some Selected Latin Squares ........ 497
Figure 10.5 Program Instructions and Output for the Latin Square Design: Data on Responses of Monkeys to Different Stimulus Conditions (Table 10.9) ........ 506
Figure 10.6 Some Selected Graeco-Latin Squares ........ 507
Figure 10.7 Program Instructions and Output for the Graeco-Latin Square Design: Data on Photographic Density for Different Brands of Flash Bulbs (Table 10.12) ........ 511
Figure 10.8 A Layout of a Split-Plot Design
Figure 10.9 Program Instructions and Output for the Split-Plot Design: Data on Yields of Two Varieties of Soybean (Table 10.15) ........ 518


Introduction 


1.0 PREVIEW 


Variation among physical observations is a common characteristic of all
scientific measurements. This property of observations, that is, their failure
to reproduce themselves exactly, arises from the necessity of taking the
observations under different conditions. Thus, in a given experiment, readings
may have to be taken by different persons at different periods of time or under
different operating or experimental conditions. In addition, there may be a
large number of external conditions over which the experimenter has no control.
Many of these uncontrolled external conditions may not affect the results of
the experiment to any significant degree. However, some of them may change the
outcome of the experiment appreciably. Such external conditions are commonly
known as factors.

The analysis of variance methodology is concerned with the investigation of 
the factors likely to contribute significant effects, by suitable choice of exper- 
iments. It is a technique by which variations associated with different factors 
or defined sources may be isolated and estimated. The procedure involves the 
division of total observed variation in the data into individual components at- 
tributable to various factors and those due to random or chance fluctuation, and 
performing tests of significance to determine which factors influence the ex- 
periment. The methodology was originally developed by Sir Ronald A. Fisher 
(1918, 1925, 1935) who gave it the name of “analysis of variance.” The analy- 
sis of variance is the most widely used tool of modern (post-1950) statistics by 
research workers in the substantive fields of biology, psychology, sociology, ed- 
ucation, agriculture, engineering, and so forth. The demand for knowledge and 
development of this topic has largely come from the aforementioned substan- 
tive fields. The development of analysis of variance methodology has in turn 
affected and influenced the types of experimental research being carried out in 
many fields. For example, in quantitative genetics which relies extensively on 
separating the total variation into environmental and genetic components, many 
of the concepts are directly linked to the principles and procedures of the anal- 
ysis of variance. Nowadays, the analysis of variance models are widely used to 
analyze the effects of the independent variables under study on the dependent 
variable or response measure of interest. 

The general synthesis of the analysis of variance procedure can be summarized
as follows. Given a collection of $n$ observations $y_i$'s, we define the
aggregate variation, called the total sum of squares, by

$$\sum_{i=1}^{n} (y_i - \bar{y})^2,$$

where

$$\bar{y} = \sum_{i=1}^{n} y_i / n.$$


Then the technique consists of partitioning the total sum of squares into
component variations due to different factors, also called sums of squares.
For example, suppose there are $Q$ such factors. Then the total sum of squares
($SS_T$) is partitioned as

$$SS_T = SS_A + SS_B + \cdots + SS_Q,$$

where $SS_A$, $SS_B$, ..., and $SS_Q$ represent the sums of squares associated
with factors $A$, $B$, ..., and $Q$, respectively, and which account in some
sense for the variation that can be attributed to these factors or sources of
variation. The experiment is so designed that each sum of squares reflects the
effect of just one factor or that of the random error attributed to chance.
Furthermore, all such sums of squares are made comparable by dividing each by
an appropriate number known as the degrees of freedom.¹ The quantities obtained by dividing
each sum of squares by the corresponding degrees of freedom are called mean 
squares. The mean squares and other relevant quantities provide the basis for 
making statistical inference either in terms of a test of hypothesis or point and 
interval estimation. For example, the random distribution of the appropriate 
ratio of a pair of mean squares often permits certain tests to be made about the 
effects of the factors involved. In this way, we can decide whether the apparent 
effect of any one factor is readily explainable purely by chance. 
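As a concrete illustration of the steps just described, the following Python sketch partitions the total sum of squares for the simplest case of a single factor, divides each piece by its degrees of freedom, and forms the ratio of the two mean squares. The function name `anova_table`, the dictionary keys, and the three small groups of numbers are hypothetical and chosen only for illustration. The degrees of freedom used (number of levels minus one for the factor, number of observations minus number of levels for error) are the usual ones for a one-way layout and are assumed here rather than derived.

```python
def anova_table(groups):
    """Partition the total sum of squares for a single factor (illustrative sketch)."""
    all_obs = [y for g in groups for y in g]
    n_total = len(all_obs)
    grand_mean = sum(all_obs) / n_total

    # Total sum of squares: squared deviations from the grand mean.
    ss_total = sum((y - grand_mean) ** 2 for y in all_obs)

    # Sum of squares attributed to the factor (variation between group means).
    ss_factor = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)

    # Sum of squares attributed to random error (variation within groups).
    ss_error = sum((y - sum(g) / len(g)) ** 2 for g in groups for y in g)

    # Mean squares: each sum of squares divided by its degrees of freedom.
    df_factor = len(groups) - 1
    df_error = n_total - len(groups)
    ms_factor = ss_factor / df_factor
    ms_error = ss_error / df_error
    return {
        "ss_total": ss_total, "ss_factor": ss_factor, "ss_error": ss_error,
        "df_factor": df_factor, "df_error": df_error,
        "ms_factor": ms_factor, "ms_error": ms_error,
        "f_ratio": ms_factor / ms_error,
    }


# Hypothetical data: three levels of one factor, four observations each.
table = anova_table([[4.0, 5.0, 6.0, 5.0], [7.0, 8.0, 6.0, 7.0], [5.0, 4.0, 5.0, 6.0]])
print(table)
```

A large ratio of the factor mean square to the error mean square suggests that the apparent effect of the factor is not readily explained by chance alone. Judging how large is large enough requires the distribution of that ratio, which this sketch does not compute.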

The use of the term “analysis of variance” to describe the statistical method- 
ology dealing with the means of a variable across groups of observations is not 
wholly satisfactory. The term seems to have its origin in the work of Fisher 
(1918) when he partitioned the total variance of a human attribute into com- 
ponents attributed to heredity, environment, and other factors; this led to an 
equation

$$\sigma^2 = \sigma_1^2 + \sigma_2^2 + \cdots + \sigma_k^2,$$

where $\sigma^2$ is the total variance and the $\sigma_i^2$'s are variances
associated with different factors. In such a case, the term “analysis of
variance” is entirely appropriate. However, as Kempthorne (1976) states, this
is rather a limited view of the entire statistical methodology falling under
the nomen of “analysis of variance.”

1 The degrees of freedom designate the number of statistically independent quantities (response
variables) that comprise a sum of squares (see Section 2.1 for further details).


1.1 HISTORICAL DEVELOPMENTS 


Credit for much of the early developments in the field of the analysis of variance 
goes to Sir Ronald Aylmer Fisher³ (1890–1962) who originally developed it in
the 1920s.⁴ He was the pioneer and innovator of the uses and applications of
statistical methods in experimental design. Shortly after the end of World War 
I, Fisher resigned a public school teaching job and accepted a position at the 
Statistical Laboratory of the Rothamsted Agricultural Experimental Station in 
Harpenden, England. The center was heavily engaged in agricultural research 
and for many years Fisher was in charge of statistical data analysis there. He 
developed and employed the analysis of variance as the principal method of 
data analysis in experimental design. Frank Yates was Fisher’s coworker at 
Rothamsted, and both of them collaborated on many important research projects. 
Yates also was primarily responsible for making many early contributions to 
the literature on analysis of variance and experimental design. Moreover, both 
Fisher and Yates made numerous indirect contributions through the staff of the 
Rothamsted Experimental Station.

On the retirement of Karl Pearson in 1933, Fisher was appointed Galton Pro- 
fessor at the University of London. Later, he moved to the faculty of Cambridge 
University to accept the Arthur Balfour Chair created by an endowment from 
the great founder of the science of eugenics. He also traveled abroad widely 
and held visiting professorships at several universities throughout the world. In 
1936, Fisher visited the United States and received an honorary degree from 
Harvard University on the occasion of its Tercentenary Celebrations. Early
in 1937, he accepted the Honorary Fellowship of the Indian Statistical Insti- 
tute and an invitation to preside over the first session of the Indian Statistical 
Congress held in January, 1938. In 1935, Fisher published a book, The Design of 
Experiments, in which logical and theoretical principles of experimental design 
were developed and expounded with a great variety of illustrative examples. 
The book has gone through numerous editions and has become a classic of 
statistical literature. 

Fisher developed and used analysis of variance mainly during the 1920s and
1930s (see Fisher (1918, 1925, 1935)) for the study of data from agricultural
experiments. From the beginning, Fisher employed both fixed effects and random
effects models (the latter, at least, when he treated intraclass correlations), and
alluded to what has later been designated a “mixed model” in his book The
Design of Experiments. The useful “analysis of variance table,” including sum of
squares, degrees of freedom, and mean squares for various sources of variation,
was first published in a paper by Fisher and Mackenzie (1923). Tippett (1931)
appears to have been the first to include a column of expected mean squares for
variance components and to estimate these components.

2 Kendall et al. (1983, p. 2) point out that it would be more appropriate to call it “analysis of sum
of squares,” “but history and brevity are against this logical usage.”

3 The brief biographical sketch of Fisher given here has been drawn, plagiaristically in some ways,
from Mahalanobis (1962).

4 There had been some early work on analysis of variance carried out by W. H. R. A. Lexis and
T. N. Thiele in the late nineteenth century (Kendall et al. (1983, p. 2)).

Eisenhart (1947a) seems to have originated the terms “Model I” and “Model
II” for fixed effects and random effects models, and he also mentions the pos- 
sibility of a mixed model. Mixed models have been designated as Model III 
(see, e.g., Ostle and Malone (1988)). Some authors use Model III to denote 
random effects models in which random effects are drawn from a finite pop- 
ulation of possible values (see, e.g., Dunn and Clark (1974, Chapter 9)). We 
have referred to these models as finite population models (see Chapter 9). Prior 
to Eisenhart (1947a), more formal treatments of the models had appeared in the 
papers by Daniels (1939) and Crump (1946) regarding random effects models, 
and by Jackson (1939) regarding the mixed model. However, a more complete 
treatment of the mixed model did not appear until Scheffé (1956a). The devel- 
opment of various models was very intensive in the 1940s and 1950s. These 
early developments in this field can be found in expository articles by Crump 
(1951) and Plackett (1960). 

Most of the early applications of analysis of variance were in the field of 
agricultural sciences. Fisher’s application of statistical theory in agricultural 
experiments brought many new developments and advances in the field. In fact, 
much of modern statistics originated to meet the research needs of agricultural
experiment stations, and much of the terminology in the field derives from this
legacy of agricultural experimentation. However, many of the
experimental design terms such as “treatment,” “plot,” and “block” used earlier 
in an agricultural context, have lost their original meaning and are nowadays 
used in all areas of research. 

Today, the methods of experimental design and analysis of variance are com- 
monly used in nearly all fields of study and scientific investigation. Some of 
the disciplines where statistical design and analysis of experiments are rou- 
tinely used include agriculture, biology, medicine, health, physical sciences, 
engineering, education, and social and behavioral sciences. 


1.2 ANALYSIS OF VARIANCE MODELS 


It is assumed that an analysis of variance model for the observations can be 
approximated by linear combinations (functions) of certain unobservable quan- 
tities known as “effects” corresponding to each factor of the experiment. The 
effects may be of two kinds: systematic or random. If the effect is systematic, 
it is called a fixed effect or Model I effect; otherwise it is called a random 
effect or Model II effect. The equation expressing the observations as a linear 
combination of the effects is known as a linear model. 




In every linear model there is at least one set of random effects, equal in
number to the number of observations, a different one of which appears in
every observation. This is called the residual effect or the error term.
Furthermore, there is usually one fixed effect that appears in every model
equation and is known as the general constant and is the mean, in some sense,
of all the observations. Thus, a general linear model is represented as⁵

$$y_{ijk \ldots q} = \mu + \alpha_i + \beta_j + \gamma_k + \cdots + e_{ijk \ldots q}, \tag{1.2.1}$$

where $y_{ijk \ldots q}$ is the observed score, $-\infty < \mu < \infty$ is the
overall mean common to all the observations, $\alpha_i, \beta_j, \gamma_k, \ldots$
are unobservable effects attributable to different factors or sources of
variation, and $e_{ijk \ldots q}$ is an unobservable random error associated
with the observation $y_{ijk \ldots q}$ and is assumed to be independently
distributed with mean zero and variance $\sigma_e^2$. Model (1.2.1) is called a
fixed effects model, or Model I, if the random effects in the model equation
are only the error terms. Thus, under a fixed effects model, the quantities
$\alpha_i, \beta_j, \gamma_k, \ldots$ are assumed to be constants. The
objective in a fixed effects model is to make inferences about the unknown
parameters $\mu, \alpha_i, \beta_j, \gamma_k, \ldots$, and $\sigma_e^2$. It is
called a random effects model or a variance components model or Model II if
all the effects in the model equation except the additive constant are random
effects. Thus, under a random effects model, the quantities
$\alpha_i, \beta_j, \gamma_k, \ldots$ are assumed to be random variables with
means of zero and variances $\sigma_\alpha^2, \sigma_\beta^2, \sigma_\gamma^2, \ldots$,
respectively. The objective in a random effects model is to make inferences
about the variances $\sigma_\alpha^2, \sigma_\beta^2, \sigma_\gamma^2, \ldots, \sigma_e^2$
and/or certain functions of them. A case falling under none of these
categories is called a mixed model or Model III. Thus, in a mixed model, some
of the effects in the model equation are fixed and some are random. Mixed
models contain a mixture of fixed and random effects and therefore represent a
blend of the fixed and the random effects models. Mixed models include, as
special cases, the fixed effects model in which all effects are assumed fixed
except the error terms, and the random effects model in which all effects are
assumed random except the general constant. The objective in a mixed model is
to make inferences about the fixed effect parameters and the variances of the
random effects. There are widespread applications of such models in a variety
of substantive fields including genetics, animal husbandry, social sciences,
and engineering.
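The distinction between Models I and II can be made concrete by simulating data from model (1.2.1) with a single factor. The sketch below is illustrative only. The function names, the effect values, the variances, and the use of normal distributions are assumptions introduced here, since the model itself requires only zero means and the stated variances.

```python
import random


def simulate_fixed(mu, alphas, sigma_e, n, rng):
    """Model I: the effects alpha_i are fixed constants; only the errors are random."""
    return [[mu + a + rng.gauss(0.0, sigma_e) for _ in range(n)] for a in alphas]


def simulate_random(mu, sigma_alpha, sigma_e, a, n, rng):
    """Model II: each effect alpha_i is itself a random draw with mean zero."""
    data = []
    for _ in range(a):
        alpha_i = rng.gauss(0.0, sigma_alpha)  # a fresh level effect for each sampled level
        data.append([mu + alpha_i + rng.gauss(0.0, sigma_e) for _ in range(n)])
    return data


rng = random.Random(12345)  # fixed seed so that the sketch is reproducible

# Hypothetical Model I setting: three chosen treatments with effects summing to zero.
fixed_data = simulate_fixed(mu=10.0, alphas=[-1.0, 0.0, 1.0], sigma_e=0.5, n=4, rng=rng)

# Hypothetical Model II setting: many levels sampled from a large population.
random_data = simulate_random(mu=10.0, sigma_alpha=2.0, sigma_e=1.0, a=2000, n=2, rng=rng)

# Take one observation per level and estimate the mean and variance of an observation.
first_obs = [group[0] for group in random_data]
mean_first = sum(first_obs) / len(first_obs)
var_first = sum((y - mean_first) ** 2 for y in first_obs) / (len(first_obs) - 1)
print(round(mean_first, 2), round(var_first, 2))
```

In the random effects run, the estimated variance of a single observation should fall near the sum of the two variance components (here 4 + 1 = 5), whereas in the fixed effects run the only source of randomness is the error term.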

Throughout this volume we consider analysis of variance based on Models I, 
II, and III. These are the most widely applicable models although other models 
have also been proposed (Tukey (1949a)). 


5 In a linear model it is customary to use a lower or an upper case Roman letter to represent a random
effect and a Greek letter to designate a fixed effect. However, the practice is not universal, and 
some authors do just the opposite; that is, they use Greek letters to represent random effects and 
Roman letters to designate fixed effects (see, e.g., Kempthorne and Folks (1971, pp. 456-470)). 
In order to keep our notations simple and uniform, we use Greek letters to represent both fixed 
and random effects except the error or residual term which is denoted by the lower case Roman 
letter (e). 




1.3 CONCEPT OF FIXED AND RANDOM EFFECTS


Whether an effect should be considered fixed or random will depend on the 
way the experimental treatments (levels of a factor) are selected and the kind 
of inferences one wishes to make from the analysis. In the fixed effects case, 
the experimenter is working with a systematically chosen set of treatments or 
levels of factors that are of particular intrinsic interest, and the inferences are 
to be made only about differences among the treatments actually being studied 
and about no other treatments that might have been included. 

Thus, in the case of fixed effects, in advance of the actual experiment, the 
experimenter decides that he wants to see if differences in effects exist among 
some predetermined set of treatments or treatment combinations. His interest 
lies in these treatments or treatment combinations and no others. That is, treat- 
ments are chosen a priori by the researcher because they are of special interest 
to him. Each treatment of practicable interest to the experimenter has been in- 
cluded in the experiment, and the set of treatments or treatment combinations 
included in the experiment cover the entire set of treatments about which the 
experimenter wants to make inferences. The effect of any treatment is “fixed”
in the sense that it must appear in any new trial of the experiment on other 
subjects or experimental units. Thus, model terms that represent treatments, 
blocks, and interactions are parameters. 

An example of fixed effects may be an experiment in which the effects of 
different drugs on groups of animals are examined. Here, the treatments (drugs) 
are fixed and determined by the experimenter, and the interest lies in the results
of the treatments and the differences between them. This is also the case when 
we test the effects of different levels (doses) of a given factor such as a chemical 
or the amount of light to which a plant has been exposed. Another example of 
fixed effects would be a study of body weights for several age groups of animals. 
The treatments (levels) would be age groups that are fixed. Other sets of factors 
or variables that are usually considered fixed are types of disease, treatment 
therapy, gender, marital and socioeconomic status, and so forth. 

In the random effects case, the experimenter is working with a randomly 
selected subset of treatments from a much larger population of all possible 
treatments about which the experimenter wants to make inferences. The subset 
of treatments included in the experiment do not exhaust the set of all possible 
treatments of interest. Thus, the treatment levels at which the experiment is 
conducted are not of interest in themselves, but rather they represent some 
of the many treatments on which the experiment could have been performed. 
Here, the effect of the treatment is not regarded as fixed, since any particular 
treatment itself need not be included each time the experiment is carried out. In 
such a case, in any repetition of the experiment a new sample of treatments is to 
be included. The experimenter may not actually plan to repeat the experiment, 
but conceptually each repetition involves a fresh sample of treatments. The 
interest of the researcher in such situations lies in determining whether different 
treatment levels yield different responses. 




For an example of random effects, suppose that in a psychological experiment 
the personality of the experimenter herself may have an effect on the results. 
There may be a large group of people, each presumably having a distinct person- 
ality, who might possibly serve as experimenters. Trying out each such person 
in the experiment would not be practically feasible. So, instead, one draws a 
random sample from some chosen population of potential experimenters. In 
this case, each experimenter constitutes an experimental treatment given to one 
group of subjects assigned to her at random. Since the experimental treatments 
employed are themselves a random sample and inferences are to be made about 
experimenter effects extending to the population of potential experimenters, the 
effects are regarded as random. Other examples of factors or variables that are 
usually considered random are animals, days, subjects, plots, and so forth. 

It is sometimes difficult to decide whether a given factor is fixed or ran- 
dom. The main distinction depends on whether the levels of the factor can be 
considered a random sample of a large collection of such levels of interest, 
or are fixed treatments whose differences the investigator wishes to investi- 
gate. Moreover, in many experimental studies involving the levels of a random 
factor, the researcher does not actually select a random sample from a large 
population of all the levels of interest. For example, animals, subjects, days, 
and the like, being studied are the ones that happen to be conveniently available. 
The usual assumption in such a situation is that the natural process leading to
the availability of such cases is a random one involving no systematic bias and 
the cases being studied are sufficiently representative of their class type. How- 
ever, if there are reasons to doubt the representativeness of the sample being 
studied, the estimation process will be biased, raising serious questions about 
the validity of the results. 


1.4 FINITE AND INFINITE POPULATIONS 


As mentioned earlier, in the case of random effects, treatments included in the 
experiment are usually assumed to be a random sample from a population of all 
possible treatments of interest. Such populations are generally considered to be 
of infinite size. However, in the definition of random effects it is not necessary 
to require that the population be of infinite size; it also could be finite. For 
example, a set of similar machines may be used for certain operations but 
measurements are obtained from only a limited number of them. In the case of 
a finite population, however, the population may be very large so that for all 
practical purposes the population may be considered to be infinite. 

Thus, while considering random effects, one must distinguish between two 
cases: when the population is finite and when it is infinite — either because 
it is really so or because the finite population is sufficiently large so that for 
all practical purposes, it can be considered as infinite. In this volume we are 
primarily concerned with random effects arising out of infinite populations. 
These are probably the most common situations in many real-life applications. 




The results dealing with the so-called finite population theory are treated only 
briefly. In the case of the so-called finite population theory, the analysis of 
variance is performed in the standard way but the results on expected mean 
squares are different. Knowing the expected mean squares, one can then decide 
which of the ratios should be used to test for the statistical significance of the 
factors or sources of variation of interest. The details on analyses and results 
covering finite population situations can be found in the works of Bennett and 
Franklin (1954, p. 404), McHugh and Mielke (1968), Gaylor and Hartwell 
(1969), Searle and Fawcett (1970), and Sahai (1974a). 


1.5 GENERAL AND GENERALIZED LINEAR MODELS 


In this volume and in many other books dealing with the analysis of variance 
and regression models, much of the theory and methodology being described 
makes the fundamental assumption of the normality of the error terms. The 
analysis of variance models along with the linear regression models are com- 
monly known as the linear (statistical) models. The linear (statistical) models 
considered in this book are just one particular formulation of a model called 
the general linear model which presents a unified treatment of regression and 
analysis of variance models. A comprehensive treatment of statistical analysis 
based on general linear models can be found in Graybill (1961, 1976), Searle 
(1971b, 1987), Hocking (1985, 1996), Littell et al. (1991), Wang and Chow 
(1994), Rao and Toutenburg (1995), Christensen (1996), and Neter et al. (1990, 
1996). Furthermore, in recent years, considerable effort has been devoted to the 
development of a wider class of linear models encompassing other probability 
distributions for their error structure. For example, in many medical and epi- 
demiological works, a common type of outcome measure is a binary response 
and the mean response entails the binomial parameter. Similarly, in many health 
studies, the response variable is the observed number of cases from a certain rare 
disease, and, thus, the error structure has a Poisson distribution. Furthermore, 
in many environmental studies, the response variable often follows a gamma 
or inverse Gaussian distribution. The linear (statistical) models that allow for 
theory and methodology to be applicable to a much more general class of linear 
models, of which the normal theory is a special case, are known as generalized 
linear models. A considerable body of literature has been developed for these 
models and the interested reader is referred to the books by McCullagh and 
Nelder (1983, 1989), Dobson (1990), and Hinkley et al. (1991). 


1.6 SCOPE OF THE BOOK 


In this volume, we consider univariate analysis of variance models, that is, 
models with a single response variate. However, many applications of analysis 
of variance models involve simultaneous measurements on several correlated 
response variates and for statistical analysis of multivariate response data one 
uses multivariate analysis of variance, which generalizes the analysis of variance
procedure for univariate normal populations to multivariate normal populations.
The analysis of variance models with multivariate response are discussed in the 
works of Krishnaiah (1980), Anderson (1984), Morrison (1990), and Lindman 
(1992), among others. Similarly, statistical methods presented in this book 
are based on normal theory assumption for inference problems. The nonpara- 
metric analysis of variance procedures based on ranks, which do not require 
normal theory assumption, are not considered. There is a significant body of 
literature on nonparametric analysis of variance and the interested reader is
referred to Conover (1971), Lehmann (1975), Daniel (1990), Sprent (1997), and
Hollander and Wolfe (1998), among others. Finally, the main focus of this book 
is on classical analysis of variance procedures, where parameters are assumed 
to be unknown constants, and the accepted statistical techniques are based on 
the frequentist theory of hypothesis testing developed by Neyman and Pearson. 
The impact of the frequentist approach is reflected in the established use of 
type I and type II errors, and p-values and confidence intervals for statistical 
analysis. In contrast, there is a growing body of literature on the use of Bayesian 
theory of statistical inference in linear models. In the Bayesian approach, all
parameters are regarded as “random” in the sense that all uncertainty about them
should be expressed in terms of a probability distribution. The basic paradigm 
of Bayesian statistics involves a choice of a joint prior distribution of all parame- 
ters of interest that could be based on objective evidence or subjective judgment 
or a combination of both. Evidence from experimental data is summarized by a
likelihood function, and the joint prior distribution multiplied by the likelihood 
function is the (unnormalized) joint posterior density. The final (normalized) 
joint posterior distribution and its marginals form the basis of all Bayesian in- 
ference (see, e.g., Lee (1997)). The Bayesian method frequently provides more 
information and makes inference more readily attainable than the traditional 
frequentist approach. A drawback of the Bayesian approach is that it tends to be
computationally intensive and, even for simple unbalanced mixed models, the 
computation of the joint posterior density of the parameters and its marginals in- 
volves high-dimensional integrals. Fortunately, as a result of recent advances in 
computing hardware and numerical algorithms, Bayesian methods have gained 
wide applicability, and the scope and complexity of Bayesian applications have 
greatly increased. Readers interested in the Bayesian approach to analysis of 
variance problems are referred to Box and Tiao (1973), Broemeling (1985), 
Schervish (1992), and Searle et al. (1992), among others. 
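The Bayesian recipe described above (prior times likelihood, followed by normalization) can be sketched numerically for a single unknown mean. Everything specific in this example is an assumption made for illustration: the data values, the normal prior with mean 0 and standard deviation 10, the known error standard deviation of 1, and the grid approximation that stands in for the integrals mentioned in the text.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


data = [4.8, 5.3, 5.1, 4.9, 5.4]            # hypothetical observations
sigma_e = 1.0                                # error standard deviation, assumed known
grid = [i * 0.01 for i in range(0, 1001)]    # candidate values of mu on [0, 10]

# Unnormalized posterior = prior density times likelihood, evaluated on the grid.
unnormalized = []
for mu in grid:
    prior = normal_pdf(mu, 0.0, 10.0)
    likelihood = math.prod(normal_pdf(y, mu, sigma_e) for y in data)
    unnormalized.append(prior * likelihood)

# Normalize so that the grid probabilities sum to one.
total = sum(unnormalized)
posterior = [u / total for u in unnormalized]

posterior_mean = sum(mu * p for mu, p in zip(grid, posterior))
print(round(posterior_mean, 3))
```

With one parameter a grid is enough. The computational burden noted in the text arises because the same normalization becomes a high-dimensional integral once several variance components and fixed effects are involved.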


One-Way Classification 


2.0 PREVIEW 


In this chapter we consider the analysis of variance associated with experiments 
having only one factor or experimental variable. Such an experimental layout 
is commonly known as one-way classification in which sample observations 
are classified (grouped) by only a single criterion. It provides the simplest 
data structure containing one or more observations at every level of a single 
factor. One-way classification is a very useful model in statistics. Many complex 
experimental situations can often be considered as one-way classification. In the 
succeeding chapters, we discuss situations involving two or more experimental 
variables. 


2.1 MATHEMATICAL MODEL 


Consider an experiment having $a$ treatment groups or $a$ different levels of a
single factor. Suppose $n$ observations have been made at each level giving a
total of $N = an$ observations. Let $y_{ij}$ be the observed score
corresponding to the $j$-th observation at the $i$-th level or treatment group.
The analysis of variance model for this experiment is given as

$$y_{ij} = \mu + \alpha_i + e_{ij} \quad (i = 1, \ldots, a; \; j = 1, \ldots, n), \tag{2.1.1}$$

where $-\infty < \mu < \infty$ is the general or overall mean (true grand mean)
common to all the observations, $\alpha_i$ is the effect due to the $i$-th
level of the factor, and $e_{ij}$ is the random error associated with the
$j$-th observation at the $i$-th level or treatment group.

Model (2.1.1) states that the score for observation $j$ at level $i$ is based
on the sum of three components: the true grand mean $\mu$ of all the treatment
populations, the effect $\alpha_i$ associated with the particular treatment
$i$, and a third part $e_{ij}$ which is strictly peculiar to the $j$-th
observation made under the $i$-th level or treatment group. The term $e_{ij}$
takes into account all those factors that have not been included in model
(2.1.1).


2.2 ASSUMPTIONS OF THE MODEL 


Before one can use model (2.1.1) to make inferences about the existence of 
effects, certain assumptions must be made: 




(i) The errors $e_{ij}$'s are assumed to be randomly distributed with mean zero
and common variance $\sigma_e^2$.

(ii) The errors associated with any pair of observations are assumed to be
uncorrelated; that is,

$$E(e_{ij} e_{i'j'}) = 0 \quad \text{for } i \neq i' \text{ and/or } j \neq j'. \tag{2.2.1}$$

(iii) Under Model I, the effects $\alpha_i$'s are assumed to be fixed constants
subject to the constraint that $\sum_{i=1}^{a} \alpha_i = 0$. This implies that
the observations $y_{ij}$'s are distributed with mean $\mu + \alpha_i$ and
common variance $\sigma_e^2$.

(iv) Under Model II, the effects $\alpha_i$'s are also assumed to be randomly
distributed with mean zero and variance $\sigma_\alpha^2$. Furthermore,
$\alpha_i$'s are uncorrelated with each other and each of the $\alpha_i$'s and
$e_{ij}$'s are also uncorrelated, that is,

$$E(\alpha_i \alpha_{i'}) = 0, \quad i \neq i' \tag{2.2.2}$$

and

$$E(\alpha_i e_{i'j}) = 0, \quad \text{for all } i, i', \text{ and } j.$$

Then from model (2.1.1), we have $\sigma_y^2 = \sigma_e^2 + \sigma_\alpha^2$
and so $\sigma_e^2$ and $\sigma_\alpha^2$ are components of $\sigma_y^2$, the
variance of an observation. Hence, $\sigma_e^2$ and $\sigma_\alpha^2$ are
called “components of variance” (see Appendix L). This implies that the
observations $y_{ij}$'s are distributed with mean $\mu$ and common variance
$\sigma_e^2 + \sigma_\alpha^2$.
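The step from assumption (iv) to the variance decomposition can be spelled out as a short derivation. The following is a sketch that uses only model (2.1.1) and the moment assumptions listed above. The symbol $\sigma_y^2$ is read as the variance of a single observation, and the two covariance lines are additional consequences worked out here rather than statements taken from the text.

```latex
\begin{align*}
\operatorname{Var}(y_{ij})
  &= \operatorname{Var}(\mu + \alpha_i + e_{ij}) \\
  &= \operatorname{Var}(\alpha_i) + \operatorname{Var}(e_{ij})
     + 2\operatorname{Cov}(\alpha_i, e_{ij}) \\
  &= \sigma_\alpha^2 + \sigma_e^2 , \\[4pt]
\operatorname{Cov}(y_{ij}, y_{ij'})
  &= \operatorname{Var}(\alpha_i) = \sigma_\alpha^2 \quad (j \neq j'), \\
\operatorname{Cov}(y_{ij}, y_{i'j'})
  &= 0 \quad (i \neq i').
\end{align*}
```

Observations within the same group therefore share the covariance $\sigma_\alpha^2$, while observations from different groups are uncorrelated.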


Remarks: (i) Under Model I, the assumption that $\sum_{i=1}^{a} \alpha_i = 0$ can be made without
any loss of generality since if $\sum_{i=1}^{a} \alpha_i$ were equal to some nonzero constant $c$, $\mu$ could
be replaced by $\mu + c/a$ and each $\alpha_i$ by $\alpha_i - c/a$, and $\sum_{i=1}^{a} \alpha_i$ would then be equal to
zero. Similarly, $e_{ij}$'s could be assumed to have zero mean. Under Model II, as in the case of the
fixed effects model, the assumption of a zero mean can also be made without any loss of generality.
In this case, however, we do not assume that $\sum_{i=1}^{a} \alpha_i = 0$ since the $\alpha_i$'s are randomly selected
from a population theoretically of infinite size, and their sum need not be zero. Instead
we have $E(\alpha_i^2) = \sigma_\alpha^2$ and $E(\alpha_i \alpha_{i'}) = 0$ for $i \neq i'$.

(ii) Figure 2.1 is a schematic representation of the levels of a factor with $a$ levels,
under the assumptions of the fixed and random effects one-way analysis of variance
model. Under the fixed effects (Model I) case, $\mu_1, \mu_2, \ldots, \mu_a$ are the means of
preselected subpopulations fixed by the design. Under the random effects (Model II) case,
$\mu_1, \mu_2, \ldots, \mu_a$ are the means of $a$ randomly selected subpopulations from a population
with mean $\mu$ and variance $\sigma_\alpha^2$.


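[Figure: two panels of normal density curves for the 1st group, 2nd group, ..., $a$-th group.
In the Model I panel the groups are labeled $N(\mu_1, \sigma_e^2)$, $N(\mu_2, \sigma_e^2)$, ..., $N(\mu_a, \sigma_e^2)$;
in the Model II panel the group labels carry the variance $\sigma_\alpha^2 + \sigma_e^2$.]

FIGURE 2.1 Schematic Representation of Model I and Model II.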


(iii) One-way classification can be regarded as a one-way nested classification where 
a factor corresponding to “replications” or “samples” is nested within the levels of 
the treatment factor. Such a layout is often termed a two-stage nested design since it 
involves random sampling performed in two stages. One-way random effects models fre- 
quently arise in experiments involving a two-stage nested design. Some examples are as 
follows: 


(a) A sample of $a$ batches of toothpaste is selected at random from a production
process comprising a large number of batches and a chemical analysis to determine
its composition is performed on $n$ samples taken from each batch.

(b) A sample of $a$ tobacco leaves is selected from a shipment batch and a chemical
analysis is performed on $n$ samples taken from each leaf to determine the nicotine
and tar content.

(c) A sample of $a$ blocks is selected from a city having a large number of blocks
and an interview is conducted on a sample of $n$ individuals from each block to
determine their voting preferences.

(d) A sample of $a$ bales is selected from a shipment of bales and an analysis to
determine its content purity is performed on $n$ cores taken from each bale.




2.3 PARTITION OF THE TOTAL SUM OF SQUARES 


Using the notation that a bar on the top and a dot in place of a suffix means that 
the particular suffix has been averaged out over the appropriate observations, 
we write the following identity 


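$$y_{ij} - \bar{y}_{..} = (\bar{y}_{i.} - \bar{y}_{..}) + (y_{ij} - \bar{y}_{i.}), \tag{2.3.1}$$

where

$$\bar{y}_{i.} = \sum_{j=1}^{n} y_{ij}/n = y_{i.}/n \tag{2.3.2}$$

and

$$\bar{y}_{..} = \sum_{i=1}^{a}\sum_{j=1}^{n} y_{ij}/an = y_{..}/an. \tag{2.3.3}$$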


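Since it is an algebraic identity, equation (2.3.1) must hold for any value of $y_{ij}$.
Squaring both sides of (2.3.1) and summing over $i$ and $j$, we have

$$\sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2 = \sum_{i=1}^{a}\sum_{j=1}^{n} (\bar{y}_{i.} - \bar{y}_{..})^2 + \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2 + 2\sum_{i=1}^{a}\sum_{j=1}^{n} (\bar{y}_{i.} - \bar{y}_{..})(y_{ij} - \bar{y}_{i.}). \tag{2.3.4}$$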


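Now, note that the cross-product term vanishes; that is,

$$\sum_{i=1}^{a}\sum_{j=1}^{n} (\bar{y}_{i.} - \bar{y}_{..})(y_{ij} - \bar{y}_{i.}) = \sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..}) \sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.}) = 0 \tag{2.3.5}$$

and

$$\sum_{i=1}^{a}\sum_{j=1}^{n} (\bar{y}_{i.} - \bar{y}_{..})^2 = n\sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})^2. \tag{2.3.6}$$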


Then, equation (2.3.4) simplifies to the following identity¹

$$\sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2 = n\sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})^2 + \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2. \tag{2.3.7}$$


1 For a geometric interpretation of the algebraic partition of the sum of squares given by identity
(2.3.7), see Kendall et al. (1983, p. 11).




Equation (2.3.7) states that the sum of the squared deviations of individual
observations from the overall mean is equal to the sum of squared deviations
of the group means from the overall or grand mean plus the sum of squared
deviations of the observations from the group means. The term on the left of
(2.3.7) is called the total sum of squares and will be abbreviated as $SS_T$. The
first term to the right of (2.3.7) is called the sum of squares “between groups”
or “due to treatments” and is abbreviated as $SS_B$, and the second term is called
the sum of squares “within groups” and is abbreviated as $SS_W$. The second term
represents the variation of the individual observations about their own sample
means and it is sometimes called the error or residual sum of squares.
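As a numerical illustration of identity (2.3.7), the following minimal sketch computes the three sums of squares for a balanced layout and confirms that the between and within parts add up to the total. The function name `partition_sums_of_squares` and the small data set are hypothetical and are chosen only for illustration.

```python
def partition_sums_of_squares(groups):
    """Return (SS_T, SS_B, SS_W) for a balanced one-way layout.

    `groups` is a list of a lists, each holding n observations.
    """
    a = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("balanced data required: every group needs n observations")
    group_means = [sum(g) / n for g in groups]
    grand_mean = sum(sum(g) for g in groups) / (a * n)
    # Total: deviations of observations from the grand mean.
    ss_total = sum((y - grand_mean) ** 2 for g in groups for y in g)
    # Between: deviations of group means from the grand mean, weighted by n.
    ss_between = n * sum((m - grand_mean) ** 2 for m in group_means)
    # Within: deviations of observations from their own group mean.
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, group_means) for y in g)
    return ss_total, ss_between, ss_within


data = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]  # hypothetical data with a = 3, n = 3
ss_t, ss_b, ss_w = partition_sums_of_squares(data)
print(ss_t, ss_b, ss_w)
print(abs(ss_t - (ss_b + ss_w)) < 1e-9)  # identity (2.3.7) holds up to rounding
```

For these data the group means are 2, 3, and 6, so the within part is 6 and the between part is 26, giving a total of 32 up to floating-point rounding.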


Remark: The meaning of the partition of the total sum of squares into between and 
within group sums of squares can be explained as follows. Individual observations in 
any sample will differ from each other or exhibit variability. These observed differences 
among individual observations can be ascribed to specific sources. First, some pairs of 
observations are in different treatment groups and their differences are due either to the 
different treatments, or to chance variation, or to both. The sum of squares between 
groups reflects the contribution of different treatments to intergroup differences as well 
as chance. On the other hand, observations in the same treatment group can differ 
only because of chance variation, since each observation within the group received 
exactly the same treatment. The sum of squares within groups reflects these intragroup 
differences due only to chance variation. Thus, in any sample, two kinds of variability — 
the sum of squares between groups, reflecting the variability due to treatments and 
chance, and the sum of squares within groups reflecting chance variation alone — can be 
isolated. 


2.4 THE CONCEPT OF DEGREES OF FREEDOM 


Before we proceed further, it is time to examine the notion of the degrees of
freedom. Recall the basic definition of the sample variance $S^2$ of a random
sample $y_1, y_2, \ldots, y_n$ as

$$S^2 = \frac{\sum_{i=1}^{n} d_i^2}{n - 1}, \tag{2.4.1}$$

where

$$\bar{y} = \sum_{i=1}^{n} y_i/n \tag{2.4.2}$$

and

$$d_i = y_i - \bar{y}. \tag{2.4.3}$$




From (2.4.1) we note that the quantity $S^2$ is based upon a sum of squared
deviations from the sample mean. However, we know that the sum of the deviations
about the mean, defined by (2.4.3), must be zero; that is,

$$\sum_{i=1}^{n} d_i = 0. \tag{2.4.4}$$


The property (2.4.4) has a very important consequence. For example, suppose
that in a random sample of $n = 4$, one is to choose all the deviations from the
mean. For the first deviation, one can guess any number, say, $d_1 = 6$. Similarly,
quite arbitrarily, one can assign two more deviations, say, $d_2 = -9$ and $d_3 = -7$.
However, when one comes to choose the value of the fourth deviation, one is no
longer free to take any number one pleases. The value of $d_4$ must be given by

$$d_4 = 0 - (d_1 + d_2 + d_3) = 0 - (6 - 9 - 7) = 10.$$

In short, given the values of $n - 1$ deviations from the mean, which could be
any arbitrarily assigned numbers, the value of the last deviation is completely
determined. Thus, we say that there are $n - 1$ degrees of freedom for a sample
variance, reflecting the fact that only $n - 1$ deviations are “free” to be any
number. Given the value of these “free” numbers, the last value is automatically
determined.²
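The argument can be traced in a few lines of code. The sketch below, whose helper name `last_deviation` is hypothetical, takes the $n - 1$ freely chosen deviations from the example in the text and returns the single value that constraint (2.4.4) forces on the remaining one.

```python
def last_deviation(free_deviations):
    """Return the deviation forced by the constraint that all deviations sum to zero."""
    return 0 - sum(free_deviations)


free = [6, -9, -7]         # the three arbitrarily assigned deviations from the text
d4 = last_deviation(free)  # the fourth deviation is no longer free
print(d4)                  # 10
print(sum(free + [d4]))    # 0
```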

To obtain the degrees of freedom associated with different sums of squares,
we note that since the total sum of squares ($SS_T$) is based on $an$ deviations
$y_{ij} - \bar{y}_{..}$'s of the $an$ observations with one constraint on the deviations, namely,

$$\sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{..}) = 0, \tag{2.4.5}$$

it has $an - 1$ degrees of freedom. Similarly, since the between group sum of
squares ($SS_B$) is computed from the deviations of the $a$ independent class means
$\bar{y}_{i.}$'s from the overall mean $\bar{y}_{..}$, but with one constraint, namely,

$$\sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..}) = 0, \tag{2.4.6}$$

it has $a - 1$ degrees of freedom.
Finally, since the within sum of squares ($SS_W$) involves the deviations of the
$an$ observations from the $a$ sample means, it has $an - a = a(n - 1)$ degrees

2 For a geometric interpretation of degrees of freedom, see Walker (1940).




of freedom. Alternatively, the degrees of freedom corresponding to the within
groups can be argued as follows. Consider the component of the within sum of
squares corresponding to the $i$-th factor level, namely,

$$\sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2. \tag{2.4.7}$$

The expression (2.4.7) is the equivalent of a total sum of squares considering
only the $i$-th factor level. Hence, there are $n - 1$ degrees of freedom associated
with this sum of squares. Since $SS_W$ is a sum of squares comprising component
sums of squares, the $i$-th component being given by (2.4.7), the degrees of
freedom associated with $SS_W$ is the sum of the $a$ component degrees of freedom,
namely,

$$\sum_{i=1}^{a} (n - 1) = a(n - 1). \tag{2.4.8}$$


2.5 MEAN SQUARES AND THEIR EXPECTATIONS 


The next question is how to use the partition of the total sum of squares and the 
corresponding degrees of freedom in making inferences about the existence of 
treatment effects. In the analysis of variance, ordinarily, it is convenient to deal 
with the quantities known as mean squares instead of sums of squares. The mean 
squares are obtained by dividing each sum of squares by the corresponding 
degrees of freedom. We denote the two mean squares, namely, between and
within by $MS_B$ and $MS_W$, respectively.
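The division by degrees of freedom can be sketched as follows. The function name `mean_squares` is hypothetical, and the input values are illustrative sums of squares for a layout with $a = 3$ and $n = 3$, not data from the text.

```python
def mean_squares(ss_between, ss_within, a, n):
    """Return (MS_B, MS_W) for a balanced one-way layout with a groups of size n."""
    df_between = a - 1       # degrees of freedom for SS_B
    df_within = a * (n - 1)  # degrees of freedom for SS_W
    return ss_between / df_between, ss_within / df_within


ms_b, ms_w = mean_squares(ss_between=26.0, ss_within=6.0, a=3, n=3)
print(ms_b, ms_w)  # 13.0 1.0
```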

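Next, we examine the expectations of within and between mean squares.
From model equation (2.1.1), we obtain

$$\bar{y}_{i.} = \sum_{j=1}^{n} (\mu + \alpha_i + e_{ij})/n = \mu + \alpha_i + \bar{e}_{i.} \tag{2.5.1}$$

and

$$\bar{y}_{..} = \sum_{i=1}^{a} (\mu + \alpha_i + \bar{e}_{i.})/a = \mu + \bar{\alpha}_{.} + \bar{e}_{..}. \tag{2.5.2}$$

Substituting the values of $y_{ij}$, $\bar{y}_{i.}$, and $\bar{y}_{..}$ from (2.1.1), (2.5.1), and (2.5.2),
respectively, into the expressions for $SS_W$ and $SS_B$ defined in (2.3.7), we find

$$SS_W = \sum_{i=1}^{a}\sum_{j=1}^{n} (e_{ij} - \bar{e}_{i.})^2 \tag{2.5.3}$$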
i=1 j=l 




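and

$$SS_B = n\sum_{i=1}^{a} (\alpha_i - \bar{\alpha}_{.} + \bar{e}_{i.} - \bar{e}_{..})^2. \tag{2.5.4}$$

Now, because the $e_{ij}$'s are uncorrelated and identically distributed with mean
zero and variance $\sigma_e^2$, it follows, using the formulae of the variance of the
sampling distribution of the $e_{ij}$'s, that

$$E(e_{ij}^2) = \sigma_e^2, \tag{2.5.5}$$

$$E(\bar{e}_{i.}^2) = \sigma_e^2/n, \tag{2.5.6}$$

and

$$E(\bar{e}_{..}^2) = \sigma_e^2/an. \tag{2.5.7}$$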


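It is then a matter of straightforward simplification to derive the expectations
of mean squares. First, taking the expectation of (2.5.3), we get

$$E(SS_W) = \sum_{i=1}^{a} E\left[\sum_{j=1}^{n} e_{ij}^2 - 2\bar{e}_{i.}\sum_{j=1}^{n} e_{ij} + n\bar{e}_{i.}^2\right] = \sum_{i=1}^{a}\left[\sum_{j=1}^{n} E(e_{ij}^2) - nE(\bar{e}_{i.}^2)\right]. \tag{2.5.8}$$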


On substituting the values of $E(e_{ij}^2)$ and $E(\bar{e}_{i.}^2)$ from (2.5.5) and (2.5.6),
respectively, into (2.5.8), we find

$$E(SS_W) = \sum_{i=1}^{a}\left(n\sigma_e^2 - n\frac{\sigma_e^2}{n}\right) = a(n - 1)\sigma_e^2. \tag{2.5.9}$$


The expectation of $MS_W$ is, therefore, given by

$$E(MS_W) = E\left(\frac{SS_W}{a(n - 1)}\right) = \sigma_e^2. \tag{2.5.10}$$


Note that result (2.5.10) is true under both Models I and II. To derive the
expectation of $MS_B$, we first consider the case of Model I and then that of
Model II. First, if we restrict ourselves to Model I, then the $\alpha_i$'s are fixed
quantities depending on the particular levels (treatments) selected in the experiment
with the restriction that $\bar{\alpha}_{.} = 0$. Then, on taking the expectation of (2.5.4), we




get 


$$E(SS_B) = n\left[\sum_{i=1}^{a} \alpha_i^2 + E\sum_{i=1}^{a} (\bar{e}_{i.} - \bar{e}_{..})^2\right] \tag{2.5.11}$$

by virtue of the fact that the $\alpha_i$'s are constants and the expectation of the
cross-product term vanishes.
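Now, using results (2.5.6) and (2.5.7), we find that

$$E\sum_{i=1}^{a} (\bar{e}_{i.} - \bar{e}_{..})^2 = \sum_{i=1}^{a} E(\bar{e}_{i.}^2) - aE(\bar{e}_{..}^2) = a\frac{\sigma_e^2}{n} - a\frac{\sigma_e^2}{an} = (a - 1)\frac{\sigma_e^2}{n}. \tag{2.5.12}$$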


Furthermore, substituting (2.5.12) into (2.5.11), we obtain

$$E(SS_B) = n\sum_{i=1}^{a} \alpha_i^2 + (a - 1)\sigma_e^2. \tag{2.5.13}$$


Finally, the expectation of $MS_B$ is given by

$$E(MS_B) = \frac{n}{a - 1}\sum_{i=1}^{a} \alpha_i^2 + \sigma_e^2. \tag{2.5.14}$$


Next, consider the case of Model II when the $\alpha_i$'s are also randomly
distributed with mean zero and variance $\sigma_\alpha^2$, and the $\alpha_i$'s are uncorrelated with
each other and with the $e_{ij}$'s. It then follows, using the formula for the variance
of the sampling distribution of the mean of the $\alpha_i$'s, that

$$E(\alpha_i^2) = \sigma_\alpha^2 \tag{2.5.15}$$

and

$$E(\bar{\alpha}_{.}^2) = \sigma_\alpha^2/a. \tag{2.5.16}$$

Now, on taking the expectation of (2.5.4), we get

$$E(SS_B) = n\left[E\sum_{i=1}^{a} (\alpha_i - \bar{\alpha}_{.})^2 + E\sum_{i=1}^{a} (\bar{e}_{i.} - \bar{e}_{..})^2\right]. \tag{2.5.17}$$




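Using results (2.5.15) and (2.5.16), we find that

$$E\sum_{i=1}^{a} (\alpha_i - \bar{\alpha}_{.})^2 = \sum_{i=1}^{a} E(\alpha_i^2) - aE(\bar{\alpha}_{.}^2) = a\sigma_\alpha^2 - a\frac{\sigma_\alpha^2}{a} = (a - 1)\sigma_\alpha^2. \tag{2.5.18}$$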


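On substituting (2.5.12) and (2.5.18) into (2.5.17), we get

$$E(SS_B) = n\left[(a - 1)\sigma_\alpha^2 + (a - 1)\frac{\sigma_e^2}{n}\right] = n(a - 1)\sigma_\alpha^2 + (a - 1)\sigma_e^2. \tag{2.5.19}$$

Finally, the expectation of $MS_B$ is given by

$$E(MS_B) = n\sigma_\alpha^2 + \sigma_e^2. \tag{2.5.20}$$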


It is important to recognize that in the derivation of the results on expected
sums of squares and mean squares, we have not made any distribution
assumptions for the $e_{ij}$'s under Model I and for the $e_{ij}$'s as well as the $\alpha_i$'s under
Model II. This fact has important implications while procuring unbiased estimates
for the parameters under both Models I and II.
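Results (2.5.10) and (2.5.20) can be checked empirically. The following is a minimal Monte Carlo sketch under Model II. The function name `simulate_mean_squares`, the parameter values, and the use of normal random draws are assumptions made purely for convenience of simulation, since the derivation itself requires no distributional assumption. The overall mean is set to zero because it cancels from both sums of squares.

```python
import random


def simulate_mean_squares(a, n, sigma_alpha, sigma_e, reps, seed=0):
    """Average MS_B and MS_W over `reps` simulated balanced Model II data sets."""
    rng = random.Random(seed)
    total_ms_b = 0.0
    total_ms_w = 0.0
    for _ in range(reps):
        groups = []
        for _ in range(a):
            alpha = rng.gauss(0.0, sigma_alpha)  # random group effect
            groups.append([alpha + rng.gauss(0.0, sigma_e) for _ in range(n)])
        means = [sum(g) / n for g in groups]
        grand = sum(means) / a
        ss_b = n * sum((m - grand) ** 2 for m in means)
        ss_w = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
        total_ms_b += ss_b / (a - 1)
        total_ms_w += ss_w / (a * (n - 1))
    return total_ms_b / reps, total_ms_w / reps


# Assumed values: a = 5, n = 4, sigma_alpha^2 = 4, sigma_e^2 = 1.
avg_ms_b, avg_ms_w = simulate_mean_squares(5, 4, 2.0, 1.0, reps=4000, seed=1)
print(avg_ms_b)  # should be near n*sigma_alpha^2 + sigma_e^2 = 17
print(avg_ms_w)  # should be near sigma_e^2 = 1
```

The averages only approximate the expectations, and the agreement improves as the number of replications grows.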


2.6 SAMPLING DISTRIBUTION OF MEAN SQUARES 


The between and within sums of squares or mean squares are functions of the
sample observations, and thus must have a sampling distribution. However, to
derive the form of their distributions, we require the assumption of normality for
the random components of model (2.1.1). Thus, we assume that under Model I,
the $e_{ij}$'s are completely independent and are normally distributed with mean
zero and variance $\sigma_e^2$. Furthermore, under Model II, the $\alpha_i$'s are also completely
independent of each other and of the $e_{ij}$'s, and are normally distributed with
mean zero and variance $\sigma_\alpha^2$.

Now, first of all, we note that for a normal parent population with variance $\sigma^2$,

$$\frac{\hat{\sigma}^2}{\sigma^2} \sim \frac{\chi^2[\nu]}{\nu}, \tag{2.6.1}$$

where $\hat{\sigma}^2$ is an unbiased estimator of $\sigma^2$ based on $\nu$ degrees of freedom (see
Appendix C). Furthermore, under both Models I and II, we have from (2.5.10)
that

$$E(MS_W) = \sigma_e^2. \tag{2.6.2}$$




From (2.6.1) and (2.6.2), it follows that

$$\frac{MS_W}{\sigma_e^2} \sim \frac{\chi^2[a(n - 1)]}{a(n - 1)}; \tag{2.6.3}$$

that is, the ratio of $MS_W$ to $\sigma_e^2$ is a $\chi^2[a(n - 1)]$ variable divided by $a(n - 1)$.
To derive the sampling distribution of $MS_B$, however, we have to distinguish
between the cases of Models I and II. Under Model I, we have from (2.5.14) that

$$E(MS_B) = \frac{n}{a - 1}\sum_{i=1}^{a} \alpha_i^2 + \sigma_e^2. \tag{2.6.4}$$

Therefore, $MS_B$ is an unbiased estimator of $\sigma_e^2$ only when $\alpha_i = 0$, $i = 1,
2, \ldots, a$; that is, the effects of all the levels are the same. Hence, from (2.6.1)
and (2.6.4), it follows that when $\alpha_i = 0$ $(i = 1, 2, \ldots, a)$ we have

$$\frac{MS_B}{\sigma_e^2} \sim \frac{\chi^2[a - 1]}{a - 1}; \tag{2.6.5}$$

that is, the ratio of $MS_B$ to $\sigma_e^2$ is a $\chi^2[a - 1]$ variable divided by $a - 1$. It is
important to note at this point that when the effects of all the levels are not the same,

$$\frac{MS_B}{\sigma_e^2} \sim \frac{\chi^{2\prime}[a - 1, \lambda]}{a - 1}, \tag{2.6.6}$$

where $\chi^{2\prime}[a - 1, \lambda]$ represents a noncentral $\chi^2[a - 1]$ variable with the
noncentrality parameter $\lambda$ given by


$$\lambda = \frac{n}{2\sigma_e^2}\sum_{i=1}^{a} \alpha_i^2. \tag{2.6.7}$$


The proof of this result is beyond the scope of this volume and is not presented 
here. However, the interested reader is referred to Appendix E for a definition 
of the noncentral chi-square variable. 

Under Model II, we have from (2.5.20) that

$$E(MS_B) = \sigma_e^2 + n\sigma_\alpha^2. \tag{2.6.8}$$

It can further be shown that

$$\frac{MS_B}{\sigma_e^2 + n\sigma_\alpha^2} \sim \frac{\chi^2[a - 1]}{a - 1}; \tag{2.6.9}$$




that is, the ratio of $MS_B$ to $\sigma_e^2 + n\sigma_\alpha^2$ is a $\chi^2[a - 1]$ variable divided by $a - 1$.
The result (2.6.9) is true irrespective of whether $\sigma_\alpha^2 = 0$ or not. A proof of this
result can be found in Graybill (1961, pp. 344-345; 1976, pp. 609-610) and
Kempthorne and Folks (1971, pp. 467-470). Furthermore, it can be proven that
under both Models I and II, the two statistics $MS_B$ and $MS_W$ are statistically
independent. In other words, the value of $MS_W$, whether particularly large or
small, gives us no information about whether the value of $MS_B$ is particularly
large or small. The mathematical proof that $MS_B$ and $MS_W$ are independent is
rather involved and is not presented here (see, e.g., Graybill (1961, pp. 345-
346; 1976, pp. 609-610); Scheffé (1959, Chapter 2)). Intuitively, one can argue
as follows. Since $SS_B$ is based solely on the group mean values, it has nothing
to do with the individual variation within any group. Similarly, $SS_W$ is based
solely on the individual variation within groups (i.e., measured from their
respective group means) and is, therefore, not affected whatever the group
means happen to be.


2.7 TEST OF HYPOTHESIS: THE ANALYSIS 
OF VARIANCE F TEST 


In this section, we present the usual hypothesis about the treatment effects and 
the appropriate F test for fixed and random effects models. 


MODEL I (FIXED EFFECTS)


Under Model I, the usual null hypothesis of interest is that all the treatments
(factor levels) have the same effect; that is,

$$H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_a = 0. \tag{2.7.1}$$

The alternative is

$$H_1: \text{not all } \alpha_i\text{'s are zero.}$$


In order to develop a test statistic for the hypothesis (2.7.1), we note from
(2.5.10) and (2.5.14) that when $H_0$ is true

$$E(MS_W) = \sigma_e^2 \tag{2.7.2}$$

and

$$E(MS_B) = \sigma_e^2; \tag{2.7.3}$$

that is, both the mean squares $MS_W$ and $MS_B$ are unbiased estimates of the
same quantity $\sigma_e^2$. On the other hand, when (2.7.1) is false,

$$E(MS_B) > E(MS_W). \tag{2.7.4}$$




Furthermore, since under $H_0$ both of these mean squares divided by $\sigma_e^2$ are
independently distributed as chi-square variables divided by their respective
degrees of freedom, it follows that their ratio is distributed as Snedecor's $F$.
Thus, the ratio

$$F = \frac{MS_B/\sigma_e^2}{MS_W/\sigma_e^2} = \frac{MS_B}{MS_W} \tag{2.7.5}$$


is distributed as an $F$ variable with $a - 1$ and $a(n - 1)$ degrees of freedom
(see Appendix D). Hence, the null hypothesis (2.7.1) can be tested by computing
the ratio (2.7.5) and comparing it directly with the one-tailed values in the tables
of the $F$ distribution with $a - 1$ and $a(n - 1)$ degrees of freedom.³ An $\alpha$-level
is chosen in advance and if the calculated value is greater than the $100(1 - \alpha)$th
percentage point of the $F$ distribution with $a - 1$ and $a(n - 1)$ degrees of freedom,
we may conclude that the hypothesis is false at the $\alpha$-level of significance.
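The decision rule can be sketched as follows. The function name `anova_f_test` is hypothetical, the mean squares are illustrative values for a layout with $a = 4$ and $n = 10$, and the critical value 2.87 is an assumed approximate 95th percentage point of the $F$ distribution with 3 and 36 degrees of freedom, as one would read it from a table.

```python
def anova_f_test(ms_between, ms_within, critical_value):
    """Return the F ratio (2.7.5) and whether it exceeds the tabulated critical value."""
    f_ratio = ms_between / ms_within
    return f_ratio, f_ratio > critical_value


# Illustrative mean squares; critical value assumed from an F table (3 and 36 df, alpha = 0.05).
f_ratio, reject = anova_f_test(ms_between=250.0, ms_within=50.0, critical_value=2.87)
print(f_ratio, reject)  # 5.0 True
```

Keeping the critical value as an explicit argument mirrors the table lookup described in the text and avoids tying the sketch to any particular statistical library.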

The formal hypothesis testing procedure involving a fixed $\alpha$ as described is
very useful and has led to many important developments in statistical theory.
However, an alternative way to test the hypothesis (2.7.1) is in terms of the
p-value. The p-value for a sample outcome is the probability, computed under the
null hypothesis, of obtaining a result equal to or more extreme than the observed
one. In this case, $p = P\{F[a - 1, a(n - 1)] > F_0\}$, where $F[a - 1, a(n - 1)]$ has an
$F$ distribution with $a - 1$ and $a(n - 1)$ degrees of freedom and $F_0$ is the observed
value of the statistic (2.7.5). Larger p-values support $H_0$ and smaller p-values
support $H_1$. A fixed $\alpha$-level test can be carried out by comparing the p-value with
the specified $\alpha$-level. If the p-value is greater than the specified $\alpha$, $H_0$ is not
rejected; otherwise it is rejected.⁴ For a further discussion of p-value, see Gibbons
and Pratt (1975) and Pratt and Gibbons (1981, pp. 23-32).
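For readers without access to tables, the upper-tail probability can be approximated numerically. The sketch below uses the standard identity $P\{F[\nu_1, \nu_2] > F_0\} = I_x(\nu_2/2, \nu_1/2)$ with $x = \nu_2/(\nu_2 + \nu_1 F_0)$, where $I_x$ is the regularized incomplete beta function, and evaluates the integral by Simpson's rule. The function name `f_upper_tail` and the choice of Simpson's rule are assumptions of this sketch, and it is intended only for $F_0 > 0$ and $\nu_2 > 2$.

```python
import math


def f_upper_tail(f0, df1, df2, steps=2000):
    """Approximate P(F[df1, df2] > f0) by Simpson's rule (steps must be even).

    Assumes f0 > 0 and df2 > 2, so the beta density is finite on [0, x].
    """
    a, b = df2 / 2.0, df1 / 2.0
    x = df2 / (df2 + df1 * f0)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

    def density(t):
        # Beta(a, b) density; zero at t = 0 because a > 1 here.
        if t <= 0.0:
            return 0.0
        return math.exp((a - 1) * math.log(t) + (b - 1) * math.log(1 - t) - log_beta)

    h = x / steps
    total = density(0.0) + density(x)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * density(k * h)
    return total * h / 3.0


# Illustrative call: observed F ratio 5.0 with 3 and 36 degrees of freedom.
p_value = f_upper_tail(5.0, 3, 36)
print(p_value < 0.05)  # True: the p-value falls below alpha = 0.05
```

When the numerator has two degrees of freedom the tail probability has the closed form $(1 + 2F_0/\nu_2)^{-\nu_2/2}$, which gives a convenient check on the numerical integration.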

It should be observed that the $F$ statistic defined in (2.7.5) always provides
a one-tailed test of $H_0$ in terms of the sampling distribution of $F$. This is the
case since under $H_1$,

$$E(MS_B) > E(MS_W),$$

and thus the $F$ statistic is expected to show a value greater than one. The value of
an $F$ statistic less than one can signify nothing except the sampling error,⁵ or perhaps
nonrandomness of the sample observations, or violation of the assumptions.


3 The $F$ test considered here is designed for alternatives $\alpha_i \neq \alpha_{i'}$ for some pair $(i, i')$ in all possible
directions. For a discussion of monotone alternatives $\alpha_1 \le \alpha_2 \le \cdots \le \alpha_a$, see Miller (1986,
Section 3.1.3) and references cited therein.

4 Hodges and Lehmann (1970, p. 317) suggest that one can consider the p-value as a “measure
of the degree of surprise” which the experimental data should cause in the belief of the null
hypothesis. Miller (1986, p. 2) has termed the p-value a “measure of the credibility” of the
null hypothesis. The smaller the value, the less likely one feels about the veracity of the null
hypothesis.

5 When the null hypothesis is true, one can expect a value of the $F$ ratio less than one at least 50
percent of the time.




For example, if the measurements are not made in a random order, then any 
uncontrolled factor may have some varying effect on the sequence of experi- 
ments. This could cause an increase in within-group variance but may leave the 
between-group variance unaffected. Thus, if the value of the statistic (2.7.5) is 
obtained as significantly less than one, then it is possible that an important un- 
controlled factor has not been randomized during the course of the experiment 
and much of the usefulness of the experimental results has been invalidated. 


Remarks: (i) It should be noted that

$$E(F) \neq \frac{E(MS_B)}{E(MS_W)},$$

since, in general, the expected value of a ratio of two random variables is not equal to
the ratio of the expected values of the random variables, even though the latter may be
equal. Actually, it can be shown that when the null hypothesis is true

$$E(F) = \frac{a(n - 1)}{a(n - 1) - 2}.$$

Thus, under $H_0$,

$$\frac{E(MS_B)}{E(MS_W)} = 1, \quad \text{but} \quad E(F) > 1.$$

(ii) The analysis of variance model (2.1.1) may also be looked upon as a way to 
explain the variation in the dependent variable. The ratio of the between-group sum 
of squares to the total sum of squares gives the proportion of the total sum of squares 
accounted for by the linear model being posited and provides a measure of how well the 
model fits the data. A very low value indicates that the model fails to explain a lot of 
variation in the dependent variable and one may want to look for additional factors that 
may help to account for a higher proportion of the variation in the dependent variable.
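The proportion described in this remark is a one-line computation. In the sketch below the function name `proportion_explained` is hypothetical and the sums of squares are illustrative values, not data from the text.

```python
def proportion_explained(ss_between, ss_within):
    """Return SS_B / SS_T, the share of the total sum of squares due to the model."""
    ss_total = ss_between + ss_within  # partition identity (2.3.7)
    return ss_between / ss_total


print(proportion_explained(ss_between=26.0, ss_within=6.0))  # 0.8125
```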


MODEL II (RANDOM EFFECTS)


Under Model II, if all the factor levels have the same effect in the population
of random effects $\alpha_i$'s, then $\sigma_\alpha^2 = 0$. Hence, the Model II analogue of the null
hypothesis (2.7.1) is

$$H_0: \sigma_\alpha^2 = 0, \tag{2.7.6}$$

against the alternative

$$H_1: \sigma_\alpha^2 > 0. \tag{2.7.7}$$

Again, from (2.5.10) and (2.5.20), it follows that when $H_0$ is true,

$$E(MS_W) = \sigma_e^2 \tag{2.7.8}$$




and

$$E(MS_B) = \sigma_e^2; \tag{2.7.9}$$

that is, both the mean squares $MS_B$ and $MS_W$ are unbiased for the error
variance $\sigma_e^2$. Furthermore, as stated in the preceding section, $MS_B/(\sigma_e^2 + n\sigma_\alpha^2)$ is
distributed as a chi-square variable divided by $a - 1$ and $MS_B$ is statistically
independent of $MS_W$. Thus, the ratio

$$\frac{MS_B/(\sigma_e^2 + n\sigma_\alpha^2)}{MS_W/\sigma_e^2} = \left(1 + n\frac{\sigma_\alpha^2}{\sigma_e^2}\right)^{-1}\frac{MS_B}{MS_W} \tag{2.7.10}$$

is distributed as an $F$ variable with $a - 1$ and $a(n - 1)$ degrees of freedom
and, when $H_0$ is true, the statistic $MS_B/MS_W$ has an $F$ distribution. Hence, the
same test statistic, as used in the case of Model I, can be employed to test the
hypothesis (2.7.6).


Remarks: (i) Sometimes, it is quite likely that the hypothesis (2.7.6) versus (2.7.7) may
not be a realistic choice. For example, it may be thought that some differences between
the factor levels (groups) are almost certain to exist, and then it makes little sense to
test the hypothesis of no difference. However, the researcher may want to see whether
$\sigma_\alpha^2 \le \sigma_e^2/2$; that is, if the variability between groups is half or less than the variability
within groups. In other words, a researcher may want to test a hypothesis of the type:

$$H_0: \sigma_\alpha^2/\sigma_e^2 \le \rho_0 \quad \text{vs.} \quad H_1: \sigma_\alpha^2/\sigma_e^2 > \rho_0, \tag{2.7.11}$$

where $\rho_0$ is a specific value of $\sigma_\alpha^2/\sigma_e^2$. In this case, the statistic (2.7.10) with $\sigma_\alpha^2/\sigma_e^2 = \rho_0$
provides the proper $F$ test with large values of the statistic providing a ground for
rejection. Thus, the test consists of rejecting $H_0$ if $F_{obs} > (1 + n\rho_0)F[a - 1, a(n - 1);
1 - \alpha]$, where $F_{obs} = MS_B/MS_W$.⁶
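The rejection rule for a hypothesis of the type (2.7.11) can be sketched as follows. The function name `variance_ratio_test` is hypothetical, the mean squares and $\rho_0 = 0.5$ are illustrative values, and the critical value 2.87 is an assumed approximate 95th percentage point of the $F$ distribution with 3 and 36 degrees of freedom taken from a table.

```python
def variance_ratio_test(ms_between, ms_within, n, rho0, f_critical):
    """Test H0: sigma_alpha^2 / sigma_e^2 <= rho0 using the inflated critical value."""
    f_obs = ms_between / ms_within
    threshold = (1 + n * rho0) * f_critical  # critical value scaled by (1 + n*rho0)
    return f_obs, threshold, f_obs > threshold


f_obs, threshold, reject = variance_ratio_test(250.0, 50.0, n=10, rho0=0.5, f_critical=2.87)
print(f_obs, round(threshold, 2), reject)  # 5.0 17.22 False
```

With these illustrative values the usual test of $\sigma_\alpha^2 = 0$ would reject, but the data give no ground for concluding that the variance ratio exceeds 0.5.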

(ii) One is occasionally interested in testing the hypothesis that the overall mean ($\mu$) is
equal to some given constant $\mu_0$. The test can be performed by considering the quantity
$MS_0 = an(\bar{y}_{..} - \mu_0)^2$, which has one degree of freedom. Furthermore, it can be shown
that

$$E(MS_0) = \begin{cases} \sigma_e^2 + an(\mu - \mu_0)^2, & \text{for Model I} \\ \sigma_e^2 + an(\mu - \mu_0)^2 + n\sigma_\alpha^2, & \text{for Model II.} \end{cases}$$

Thus, under Model I, the test of the hypothesis can be based on the ratio $MS_0/MS_W$, which
has an $F$ distribution with 1 and $a(n - 1)$ degrees of freedom. Under Model II, by contrast,
the test can be based on the ratio $MS_0/MS_B$, which has an $F$ distribution with 1 and
$a - 1$ degrees of freedom. It should be noticed that in this example Models I and II have
led to different significance tests.
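The two test ratios of this remark can be computed side by side. In the sketch below the function name `overall_mean_test_ratios` and all numerical inputs are hypothetical and serve only to show how the choice of denominator differs between the two models.

```python
def overall_mean_test_ratios(grand_mean, mu0, a, n, ms_between, ms_within):
    """Return (MS_0, Model I ratio MS_0/MS_W, Model II ratio MS_0/MS_B)."""
    ms_0 = a * n * (grand_mean - mu0) ** 2  # one degree of freedom
    return ms_0, ms_0 / ms_within, ms_0 / ms_between


ms_0, ratio_model_1, ratio_model_2 = overall_mean_test_ratios(
    grand_mean=12.0, mu0=10.0, a=4, n=10, ms_between=250.0, ms_within=50.0
)
print(ms_0, ratio_model_1, ratio_model_2)  # 160.0 3.2 0.64
```

The Model I ratio is referred to the $F$ distribution with 1 and $a(n - 1)$ degrees of freedom and the Model II ratio to the $F$ distribution with 1 and $a - 1$ degrees of freedom.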


6 Spjotvoll (1967) has studied the structure of optimum tests of the hypothesis of the type (2.7.11).




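TABLE 2.1
Analysis of Variance for Model (2.1.1)

                                                  Expected Mean Square
Source of   Degrees of   Sum of    Mean     ------------------------------------------------------------
Variation   Freedom      Squares   Square   Model I                                                   Model II                          F Value

Between     $a - 1$      $SS_B$    $MS_B$   $\sigma_e^2 + \frac{n}{a-1}\sum_{i=1}^{a}\alpha_i^2$      $\sigma_e^2 + n\sigma_\alpha^2$   $MS_B/MS_W$
Within      $a(n - 1)$   $SS_W$    $MS_W$   $\sigma_e^2$                                              $\sigma_e^2$

Total       $an - 1$     $SS_T$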


2.8 ANALYSIS OF VARIANCE TABLE 


The results on the partition of the total sum of squares, degrees of freedom, expected 
mean squares, and the analysis of variance F test are usually summarized in the form 
of a table commonly referred to as the analysis of variance table. The table shows in a 
certain order the sums of squares and other related quantities used in the computation 
of the F test. Such a table greatly simplifies the arithmetic and algebraic details of the 
analysis, which tend to become rather complicated in more complex designs. 

In an analysis of variance table, the first column, designated as the source of variation, 
represents the partitioning of the total response variation into the various components 
included in the linear model. The second column, designated as degrees of freedom, 
partitions the sample size into various components that relate the amount of information 
corresponding to each source of variation of the model. The third column, designated 
as sum of squares, contains the sums of squares associated with various components 
or sources of variation of the model. The fourth column, designated as mean square, 
lists the respective sums of squares divided by the corresponding degrees of freedom. 
The fifth column, designated as expected mean square, contains expected mean squares 
that represent expected values of the mean squares derived under the assumption of an 
analysis of variance model. The sixth column, designated as F value, contains the values 
of the F ratios which are generally formed by taking ratios of two mean squares. In most 
of the worked examples presented in this volume, we have generally added a seventh 
column, designated as p-value, which contains probabilities of obtaining a result equal 
to or more extreme than the observed F ratios if the null hypothesis were true. 

Table 2.1 shows the general form of the analysis of variance table for the one-way 
classification model (2.1.1). 
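The layout of the table can be reproduced programmatically. The following minimal sketch builds the degrees of freedom, sum of squares, mean square, and $F$ value columns for balanced data. The function name `anova_table` and the data set are hypothetical, and the sketch assumes that the within mean square is not zero.

```python
def anova_table(groups):
    """Return rows (source, df, SS, MS, F) of the one-way table for balanced data."""
    a, n = len(groups), len(groups[0])
    means = [sum(g) / n for g in groups]
    grand = sum(means) / a  # equals the grand mean because the data are balanced
    ss_b = n * sum((m - grand) ** 2 for m in means)
    ss_w = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    ms_b, ms_w = ss_b / (a - 1), ss_w / (a * (n - 1))
    return [
        ("Between", a - 1, ss_b, ms_b, ms_b / ms_w),
        ("Within", a * (n - 1), ss_w, ms_w, None),
        ("Total", a * n - 1, ss_b + ss_w, None, None),
    ]


for source, df, ss, ms, f in anova_table([[1, 2, 3], [2, 3, 4], [5, 6, 7]]):
    ms_text = "" if ms is None else f"{ms:.3f}"
    f_text = "" if f is None else f"{f:.3f}"
    print(f"{source:<8}{df:>4}{ss:>12.3f}{ms_text:>12}{f_text:>12}")
```

Empty cells in the Within and Total rows follow the convention of Table 2.1, where only the Between row carries an $F$ value.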


2.9 POINT ESTIMATION: ESTIMATION OF TREATMENT 
EFFECTS AND VARIANCE COMPONENTS 


It should be recognized that performance of the F test and construction of 
the analysis of variance table by no means complete all the inferences the 
investigator may want to draw. The experimenter’s main objective is not always 




to test the equality of all the treatment means. In many instances, he may want to
estimate various parameters or functions of parameters. Under fixed effects, the
parameters of the model (2.1.1) are $\mu, \alpha_1, \ldots, \alpha_a$, and $\sigma_e^2$, which can be estimated
from the sample data. The random effects version of the model (2.1.1) involves
the parameters $\mu$, $\sigma_\alpha^2$, and $\sigma_e^2$, which also can be estimated.

As estimators of the parameters, we consider the best linear and best quadratic
unbiased estimators. Under Model I, we have

$$E(y_{ij}) = \mu + \alpha_i. \tag{2.9.1}$$

Then it is straightforward to see that

$$E(\bar{y}_{i.}) = \frac{(\mu + \alpha_i) + (\mu + \alpha_i) + \cdots + (\mu + \alpha_i)}{n} = \mu + \alpha_i \tag{2.9.2}$$

and

$$E(\bar{y}_{..}) = \frac{(\mu + \alpha_1) + \cdots + (\mu + \alpha_1) + \cdots + (\mu + \alpha_a) + \cdots + (\mu + \alpha_a)}{an} = \mu, \tag{2.9.3}$$

since generally the $\alpha_i$'s are chosen such that $\sum_{i=1}^{a} \alpha_i = 0$. From (2.9.2) and
(2.9.3), the unbiased estimates of $\mu$ and $\alpha_i$ are

$$\hat{\mu} = \bar{y}_{..} \tag{2.9.4}$$

and

$$\hat{\alpha}_i = \bar{y}_{i.} - \bar{y}_{..}. \tag{2.9.5}$$

It can be shown that (2.9.4) and (2.9.5) are the so-called best linear unbiased
estimates (BLUE) for $\mu$ and $\alpha_i$, respectively. With the additional assumption
of normality they are the best unbiased estimates. Furthermore, from (2.5.10),
it follows that $MS_W$ is an unbiased estimate for $\sigma_e^2$. In addition, it can be
shown that $MS_W$ is the best quadratic unbiased estimate of $\sigma_e^2$ and if the
$e_{ij}$'s are normally distributed, $MS_W$ is the best unbiased estimate (see Graybill
(1954)).
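The Model I estimates (2.9.4) and (2.9.5), together with $MS_W$ as the estimate of $\sigma_e^2$, can be computed as in the following sketch. The function name `fixed_effects_estimates` and the data are hypothetical, and balanced data are assumed.

```python
def fixed_effects_estimates(groups):
    """Return (mu_hat, list of alpha_hat_i, sigma_e^2 estimate) for balanced data."""
    a, n = len(groups), len(groups[0])
    means = [sum(g) / n for g in groups]
    mu_hat = sum(means) / a                   # grand mean, estimate (2.9.4)
    alpha_hat = [m - mu_hat for m in means]   # treatment effects, estimate (2.9.5)
    ss_w = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    sigma2_e_hat = ss_w / (a * (n - 1))       # MS_W
    return mu_hat, alpha_hat, sigma2_e_hat


mu_hat, alpha_hat, sigma2_e_hat = fixed_effects_estimates([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(mu_hat, alpha_hat, sigma2_e_hat)
print(abs(sum(alpha_hat)) < 1e-9)  # the estimated effects sum to zero
```

The estimated effects automatically satisfy the same zero-sum restriction that is imposed on the $\alpha_i$'s in the model.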

Under Model II, instead of estimating the effects directly by taking differences
of the treatment means from the grand mean as in the case of Model I, the
problem of major interest is the estimation of the components of variance $\sigma_\alpha^2$
and $\sigma_e^2$. One set of estimators of $\sigma_\alpha^2$ and $\sigma_e^2$ is immediately obtained using the
standard method of moment estimation based on the expected mean squares
appearing in the analysis of variance table. Thus, from Table 2.1, we obtain




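that

$$E(MS_W) = \sigma_e^2 \tag{2.9.6}$$

and

$$E(MS_B) = \sigma_e^2 + n\sigma_\alpha^2. \tag{2.9.7}$$

Hence, it follows that

$$\hat{\sigma}_e^2 = MS_W \tag{2.9.8}$$

and

$$\hat{\sigma}_\alpha^2 = (MS_B - MS_W)/n \tag{2.9.9}$$

are unbiased estimators of $\sigma_e^2$ and $\sigma_\alpha^2$, respectively.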
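Estimators (2.9.8) and (2.9.9) translate directly into code. In the sketch below the function name `anova_variance_components` is hypothetical and the mean squares are illustrative values for a layout with $n = 10$ observations per group.

```python
def anova_variance_components(ms_between, ms_within, n):
    """Return the analysis of variance estimates (sigma_e^2, sigma_alpha^2)."""
    sigma2_e_hat = ms_within                         # estimator (2.9.8)
    sigma2_alpha_hat = (ms_between - ms_within) / n  # estimator (2.9.9)
    return sigma2_e_hat, sigma2_alpha_hat


print(anova_variance_components(ms_between=250.0, ms_within=50.0, n=10))  # (50.0, 20.0)
```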


Remarks: (i) It can be shown that the estimators (2.9.8) and (2.9.9) are also the
maximum likelihood estimators⁷ (corrected for bias) of the corresponding parameters (see,
e.g., Graybill (1961, pp. 338-344)). Furthermore, it can be proven that in the class of all
quadratic unbiased estimators (quadratic functions of the observations), the estimators
(2.9.8) and (2.9.9) have minimum variance (see Graybill and Hultquist (1961); Graybill
(1976, pp. 614-615)).

(ii) The aforesaid property of the “minimum variance quadratic unbiased estimation”
of the above estimators of variance components holds irrespective of the form of the
distribution of the random effects $\alpha_i$'s and $e_{ij}$'s. Moreover, with the additional assumption
of normality, the estimators (2.9.8) and (2.9.9) can be shown to have minimum variance
within the class of all unbiased estimators (see Graybill (1954); Graybill and Wortham
(1956)). Thus, the estimators (2.9.8) and (2.9.9) have certain optimal properties.⁸
Nevertheless, the estimate of $\sigma_\alpha^2$ can be negative.

(iii) It is clearly embarrassing to estimate a variance component as a negative number,
since a variance is, by definition, a nonnegative quantity. Several courses of action,
however, are available. One procedure is to accept the negative estimate as an indication
that the true population value of the variance component is close to zero, and to replace
it by zero, assuming of course that the sampling variability produced the negative estimate.
This seems to have some intuitive appeal but the replacement of a negative estimate by
zero affects some statistical properties such as unbiasedness of the estimates. Alternatively,
one may reestimate the variance components using methods of estimation that always
produce nonnegative estimates. Still another alternative is to interpret the negative estimate
as an indication that the assumed model is wrong. A full discussion of these results is
beyond the scope of


7 For the nonnegative maximum likelihood estimators, see Sahai and Thompson (1973).
8 There are biased estimators that have more desirable properties in terms of the mean squared
error criterion.




this volume. The interested reader is referred to the survey papers by Harville (1969, 
1977), Searle (1971a), and Khuri and Sahai (1985) and books by Searle (1971b) and 
Searle et al. (1992) which contain ample discussions of this and other related topics. 
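The first course of action mentioned in this remark, replacing a negative estimate by zero, can be sketched as follows. The function name `truncated_variance_component` and the mean squares are hypothetical. The truncated value is no longer unbiased, as noted in the remark.

```python
def truncated_variance_component(ms_between, ms_within, n):
    """Return the raw estimate (2.9.9) and its version truncated at zero."""
    raw = (ms_between - ms_within) / n
    return raw, max(0.0, raw)


# Illustrative case in which MS_B < MS_W, so the raw estimate is negative.
print(truncated_variance_component(ms_between=30.0, ms_within=50.0, n=10))  # (-2.0, 0.0)
```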

Knowing the estimates of $\sigma_\alpha^2$ and $\sigma_e^2$, the estimate of the total variance $\sigma_y^2$ is
obtained as

$$\hat{\sigma}_y^2 = \hat{\sigma}_\alpha^2 + \hat{\sigma}_e^2. \tag{2.9.10}$$
The fact that the total variance consists of $\sigma_\alpha^2$ and $\sigma_e^2$ permits one to make a
somewhat more informative use of the estimates of $\sigma_\alpha^2$ and $\sigma_e^2$. We can take the
ratio of the estimated $\sigma_\alpha^2$ to the estimated total variance ($\hat{\sigma}_y^2$) to find the estimated
proportion of variance accounted for by the factor levels. It is highly informative
to estimate variance components individually and to employ them in evaluating
proportions of variance accounted for by different factors. Such proportions give
one of the best ways to decide if a factor is a predictably important one. For
example, it is entirely possible for a given factor to give statistically significant
results in a study, even though only a very small percentage of variance is
attributable to that factor. This is most likely to happen, of course, if the sample
size $n$ is very large. On the other hand, when there is significant evidence for effects
of a factor and the factor also accounts for a relatively large percentage of
variance, then this information may be an important instrument in interpreting
the experimental results or in deciding how the experimental findings might be
applied. Thus, in Model II experiments, when the levels of a factor are sampled,
it is a good practice to estimate the components of variance, and to judge the
significance of the factor on the basis of the explained variation in addition to
the results of the $F$ test.

In many areas of research, particularly in genetics and industrial work,
interest may center on the estimation of the variance ratio⁹ $\theta = \sigma_\alpha^2/\sigma_e^2$, or the
intraclass correlation defined by $\rho = \sigma_\alpha^2/(\sigma_e^2 + \sigma_\alpha^2)$ (see Appendix M for a
definition of the intraclass correlation). It can be shown that the uniformly minimum
variance unbiased (UMVU) estimator of $\theta$ is given by (see, e.g., Graybill (1961,
pp. 378-379); Winer et al. (1991, p. 97))

$$\hat{\theta} = \frac{1}{n}\left[\frac{MS_B}{MS_W} \cdot \frac{a(n - 1) - 2}{a(n - 1)} - 1\right] = \frac{[a(n - 1) - 2]MS_B - a(n - 1)MS_W}{an(n - 1)MS_W} = \frac{MS_B - mMS_W}{mnMS_W}, \tag{2.9.11}$$


9 The ratio $\theta = \sigma_\alpha^2/\sigma_e^2$ measures the size of the population variability relative to the error variability
present in the data.




where

$$
m = \frac{a(n-1)}{a(n-1)-2}.
$$

By way of contrast,

$$
\frac{\hat\sigma_\alpha^2}{\hat\sigma_e^2} = \frac{MS_B - MS_W}{n\,MS_W}. \tag{2.9.12}
$$


To get an idea of the order of the magnitude of the bias in the estimator 
(2.9.12), consider the following data (Winer et al. 1991, pp. 97-98): 


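$$
a = 4, \quad n = 10, \quad MS_B = 250, \quad MS_W = 50.
$$

For this example,

$$
\hat\sigma_e^2 = 50 \quad \text{and} \quad \hat\sigma_\alpha^2 = \frac{1}{10}(250 - 50) = 20.
$$

Hence,

$$
\frac{\hat\sigma_\alpha^2}{\hat\sigma_e^2} = \frac{20}{50} = 0.400.
$$

Whereas, from (2.9.11), we have

$$
\hat\theta = \frac{250 - \frac{36}{34}(50)}{\frac{360}{34}(50)} = 0.372.
$$
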
Thus, the estimator (2.9.12) is slightly positively biased in relation to the esti- 
mator (2.9.11). 
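
The comparison between (2.9.11) and (2.9.12) can be checked numerically. The following is a minimal Python sketch rather than part of the original text; the function names `theta_umvu` and `theta_plugin` are illustrative assumptions.

```python
def theta_umvu(ms_b, ms_w, a, n):
    """UMVU estimator (2.9.11) of theta = sigma_alpha^2 / sigma_e^2."""
    df_w = a * (n - 1)
    if df_w <= 2:
        raise ValueError("need a(n - 1) > 2")
    m = df_w / (df_w - 2)
    return (ms_b - m * ms_w) / (m * n * ms_w)


def theta_plugin(ms_b, ms_w, n):
    """Ratio of the individual ANOVA estimates, as in (2.9.12)."""
    return (ms_b - ms_w) / (n * ms_w)


# Data quoted in the text: a = 4, n = 10, MS_B = 250, MS_W = 50.
print(round(theta_umvu(250, 50, a=4, n=10), 3))   # 0.372
print(round(theta_plugin(250, 50, n=10), 3))      # 0.4
```

The gap between the two values shrinks as $a(n-1)$ grows, since $m$ then approaches 1.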
The UMVU estimator of the intraclass correlation cannot be expressed in
closed form (Olkin and Pratt 1958). A computer program to calculate the
UMVU estimator is given by Donoghue and Collins (1990). A biased estimator,
however, can be obtained by substituting the estimates of the individual
components in the formula for the intraclass correlation. Thus,


$$
\hat\rho = \frac{\hat\sigma_\alpha^2}{\hat\sigma_e^2 + \hat\sigma_\alpha^2}
         = \frac{MS_B - MS_W}{MS_B + (n-1)\,MS_W}. \tag{2.9.13}
$$




The estimator (2.9.13), however, can produce a negative estimate. Alternatively,
an estimate of the intraclass correlation in terms of the unbiased estimate of $\theta$ is

$$
\hat\rho' = \frac{\hat\theta}{1 + \hat\theta}. \tag{2.9.14}
$$


For the numerical example in the preceding paragraph, the estimator (2.9.13)
gives

$$
\hat\rho = \frac{20}{50 + 20} = 0.286,
$$

whereas the alternative estimator (2.9.14) in terms of the estimator of $\theta$ yields

$$
\hat\rho' = \frac{0.372}{1 + 0.372} = 0.271.
$$

Hence, the latter estimator ($\hat\rho'$) slightly underestimates $\rho$ in relation to the
estimate provided by $\hat\rho$. For a review of other estimators of $\rho$, see Donner
(1986).
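
As a quick check of the two numerical values above, the estimators (2.9.13) and (2.9.14) can be sketched in Python. The function names are hypothetical and the inputs are the mean squares and the value $\hat\theta = 0.372$ quoted in the text.

```python
def icc_from_mean_squares(ms_b, ms_w, n):
    """Estimator (2.9.13): plug the ANOVA estimates into the definition of rho."""
    return (ms_b - ms_w) / (ms_b + (n - 1) * ms_w)


def icc_from_theta(theta_hat):
    """Estimator (2.9.14): rho expressed through an estimate of theta."""
    return theta_hat / (1 + theta_hat)


print(round(icc_from_mean_squares(250, 50, n=10), 3))  # 0.286
print(round(icc_from_theta(0.372), 3))                 # 0.271
```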


2.10 CONFIDENCE INTERVALS 
FOR VARIANCE COMPONENTS 


An examination of the mean square and the expected mean square columns 
of the analysis of variance Table 2.1 suggests which quantities can be readily 
estimated by a confidence interval. First, note that each of the entries of the mean
square column can be transformed into a chi-square variable by multiplying it by 
the corresponding degrees of freedom and then dividing by the corresponding 
expected mean square value. This chi-square variable can then be used to obtain 
a confidence interval for the quantity appearing in the expected mean squares 
column. Thus, a $100(1-\alpha)$ percent confidence interval for $\sigma_e^2$ can be obtained
by noting that

$$
\frac{a(n-1)\,MS_W}{\sigma_e^2} \sim \chi^2[a(n-1)]. \tag{2.10.1}
$$


The desired confidence interval is then given by 


$$
P\left[\frac{a(n-1)\,MS_W}{\chi^2[a(n-1),\,1-\alpha/2]} < \sigma_e^2 <
       \frac{a(n-1)\,MS_W}{\chi^2[a(n-1),\,\alpha/2]}\right] = 1-\alpha, \tag{2.10.2}
$$

where $\chi^2[a(n-1), 1-\alpha/2]$ and $\chi^2[a(n-1), \alpha/2]$ denote the $100(1-\alpha/2)$th
and $100(\alpha/2)$th percentage points of the chi-square distribution with $a(n-1)$
degrees of freedom.¹⁰ Similarly, a $100(1-\alpha)$ percent confidence interval for
$\sigma_e^2 + n\sigma_\alpha^2$ can be obtained by noting that

$$
\frac{(a-1)\,MS_B}{\sigma_e^2 + n\sigma_\alpha^2} \sim \chi^2[a-1]. \tag{2.10.3}
$$


The desired confidence interval is then given by 


$$
P\left[\frac{(a-1)\,MS_B}{\chi^2[a-1,\,1-\alpha/2]} < \sigma_e^2 + n\sigma_\alpha^2 <
       \frac{(a-1)\,MS_B}{\chi^2[a-1,\,\alpha/2]}\right] = 1-\alpha, \tag{2.10.4}
$$

where again $\chi^2[a-1, 1-\alpha/2]$ and $\chi^2[a-1, \alpha/2]$ represent the $100(1-\alpha/2)$th
and $100(\alpha/2)$th percentage points of the chi-square distribution with
$(a-1)$ degrees of freedom.
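
The intervals (2.10.2) and (2.10.4) share one pattern: multiply the mean square by its degrees of freedom and divide by the two chi-square percentage points. A minimal Python sketch follows; the helper name `chi_square_interval` is an assumption, and the percentage points are passed in as numbers read from a chi-square table rather than computed. The illustration uses the values that appear later in the interview-ratings example of Section 2.14.

```python
def chi_square_interval(df, mean_square, chi2_lower, chi2_upper):
    """Two-sided interval (df*MS/chi2_upper, df*MS/chi2_lower).

    chi2_lower and chi2_upper are the 100(alpha/2)th and 100(1 - alpha/2)th
    percentage points of chi-square with df degrees of freedom (from a table).
    """
    if not 0 < chi2_lower < chi2_upper:
        raise ValueError("need 0 < chi2_lower < chi2_upper")
    total = df * mean_square
    return total / chi2_upper, total / chi2_lower


# Interval (2.10.2) for sigma_e^2 with a = 5, n = 4, MS_W = 73.150:
low, high = chi_square_interval(15, 73.150, 6.262, 27.488)
print(round(low, 3), round(high, 3))   # 39.917 175.224

# Interval (2.10.4) for sigma_e^2 + n*sigma_alpha^2 with MS_B = 420.825:
print(chi_square_interval(4, 420.825, 0.484, 11.143))
```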

Unfortunately, the expressions in the expected mean square column are the
only quantities for which one can obtain exact confidence intervals by the
procedure just described. In particular, an exact confidence interval for $\sigma_\alpha^2$ does not
exist. Various approximate confidence intervals have been proposed in the
literature. For a detailed discussion of these procedures and their relative merits, the
reader is referred to Boardman (1974) and Burdick and Graybill (1988, 1992,
pp. 60-63). Here, we briefly describe some procedures that have been
recommended for the problem. A conservative $100(1-2\alpha)$ percent confidence
interval based on the distributions of $MS_B$ and $MS_B/MS_W$ is obtained as (see, e.g.,
Williams (1962); Graybill (1976, pp. 618-620)):


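$$
P\left[\frac{(a-1)\,MS_B}{n\,\chi^2[a-1,\,1-\alpha/2]}
       \left(1 - \frac{F[a-1,\,a(n-1);\,1-\alpha/2]}{F^*}\right) < \sigma_\alpha^2 <
       \frac{(a-1)\,MS_B}{n\,\chi^2[a-1,\,\alpha/2]}
       \left(1 - \frac{F[a-1,\,a(n-1);\,\alpha/2]}{F^*}\right)\right] \geq 1-2\alpha,
\tag{2.10.5}
$$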


where $F^* = MS_B/MS_W$. The empirical evidence seems to indicate that the
probability $1-2\alpha$ in (2.10.5) can be replaced by $1-\alpha$. Similarly, two
approximate $100(1-\alpha)$ percent confidence intervals based on the distribution of the
ratio of mean squares are obtained as (see, e.g., Bulmer (1957); Scheffé (1959,
p. 235); Searle (1971b, p. 414)):¹¹


¹⁰ A slightly shorter confidence interval could be obtained by considering unequal probabilities in
each tail. Tate and Klett (1959) and Murdock and Williford (1977) provide tables of chi-square
values that provide shortest two-sided intervals.


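$$
P\left[\frac{MS_W}{n\,F[a-1,\,\infty;\,1-\alpha/2]}
       \left(\frac{MS_B}{MS_W} - F[a-1,\,a(n-1);\,1-\alpha/2]\right) < \sigma_\alpha^2 <
       \frac{MS_W}{n\,F[a-1,\,\infty;\,\alpha/2]}
       \left(\frac{MS_B}{MS_W} - F[a-1,\,a(n-1);\,\alpha/2]\right)\right] \doteq 1-\alpha
\tag{2.10.6}
$$

and

$$
P\left[\frac{MS_W\,(F^* - F_2)(F^* + F_2 - F_2')}{n\,F^* F_2'} < \sigma_\alpha^2 <
       \frac{MS_W\,(F^* - F_1)(F^* + F_1 - F_1')}{n\,F^* F_1'}\right] \doteq 1-\alpha,
\tag{2.10.7}
$$

where $F^* = MS_B/MS_W$, $F_1 = F[a-1, a(n-1); \alpha/2]$, $F_2 = F[a-1, a(n-1); 1-\alpha/2]$,
$F_1' = F[a-1, \infty; \alpha/2]$, and $F_2' = F[a-1, \infty; 1-\alpha/2]$.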

Finally, an approximate procedure that seems to provide a shorter interval
and has better coverage properties is given by (Ting et al. (1990); Burdick and
Graybill (1992, pp. 60-61)):

$$
P\left[\frac{1}{n}\left(MS_B - MS_W - \sqrt{V_L}\right) < \sigma_\alpha^2 <
       \frac{1}{n}\left(MS_B - MS_W + \sqrt{V_U}\right)\right] \doteq 1-\alpha,
\tag{2.10.8}
$$


where 


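$$
V_L = G_1^2\,MS_B^2 + H_2^2\,MS_W^2 + G_{12}\,MS_B\,MS_W,
$$
$$
V_U = H_1^2\,MS_B^2 + G_2^2\,MS_W^2 + H_{12}\,MS_B\,MS_W,
$$

with

$$
G_1 = 1 - F^{-1}[a-1,\,\infty;\,1-\alpha/2], \qquad G_2 = 1 - F^{-1}[a(n-1),\,\infty;\,1-\alpha/2],
$$
$$
H_1 = F^{-1}[a-1,\,\infty;\,\alpha/2] - 1, \qquad H_2 = F^{-1}[a(n-1),\,\infty;\,\alpha/2] - 1,
$$
$$
G_{12} = \frac{(F[a-1,\,a(n-1);\,1-\alpha/2] - 1)^2 - G_1^2\,F^2[a-1,\,a(n-1);\,1-\alpha/2] - H_2^2}
              {F[a-1,\,a(n-1);\,1-\alpha/2]},
$$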


¹¹ A confidence interval for $\sigma_\alpha^2$ that is robust to departures from normality can be obtained using
the jackknife technique (Arvesen and Schmitz (1970); Miller (1986, Section 3.6.3)).




and

$$
H_{12} = \frac{(1 - F[a-1,\,a(n-1);\,\alpha/2])^2 - H_1^2\,F^2[a-1,\,a(n-1);\,\alpha/2] - G_2^2}
              {F[a-1,\,a(n-1);\,\alpha/2]}.
$$

Although we have only approximate procedures for confidence limits for $\sigma_\alpha^2$,
we are able to obtain exact confidence limits for the ratio $\sigma_\alpha^2/\sigma_e^2$, the intraclass
correlation $\sigma_\alpha^2/(\sigma_e^2 + \sigma_\alpha^2)$, and $\sigma_e^2/(\sigma_e^2 + \sigma_\alpha^2)$. Using the distribution result in
(2.7.10); that is,


$$
\frac{\sigma_e^2}{\sigma_e^2 + n\sigma_\alpha^2}\cdot\frac{MS_B}{MS_W} \sim F[a-1,\,a(n-1)], \tag{2.10.9}
$$


it can be shown that the probability is 1 — @ that 


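$$
\frac{1}{n}\left\{\frac{MS_B}{MS_W}\cdot\frac{1}{F[a-1,\,a(n-1);\,1-\alpha/2]} - 1\right\}
< \frac{\sigma_\alpha^2}{\sigma_e^2} <
\frac{1}{n}\left\{\frac{MS_B}{MS_W}\cdot\frac{1}{F[a-1,\,a(n-1);\,\alpha/2]} - 1\right\}.
\tag{2.10.10}
$$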


Furthermore, on rearranging the inequalities in (2.10.10), it readily follows that 
with probability 1 — a, the following relations hold: 


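$$
\frac{F^* - F[a-1,\,a(n-1);\,1-\alpha/2]}{F^* + (n-1)\,F[a-1,\,a(n-1);\,1-\alpha/2]}
< \frac{\sigma_\alpha^2}{\sigma_e^2 + \sigma_\alpha^2} <
\frac{F^* - F[a-1,\,a(n-1);\,\alpha/2]}{F^* + (n-1)\,F[a-1,\,a(n-1);\,\alpha/2]}
\tag{2.10.11}
$$

and

$$
\frac{n\,F[a-1,\,a(n-1);\,\alpha/2]}{F^* + (n-1)\,F[a-1,\,a(n-1);\,\alpha/2]}
< \frac{\sigma_e^2}{\sigma_e^2 + \sigma_\alpha^2} <
\frac{n\,F[a-1,\,a(n-1);\,1-\alpha/2]}{F^* + (n-1)\,F[a-1,\,a(n-1);\,1-\alpha/2]},
\tag{2.10.12}
$$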


where $F^* = MS_B/MS_W$. The inequalities (2.10.11) and (2.10.12) provide exact
confidence limits for the intraclass correlation ($\rho$) and $1 - \rho$, respectively. Since
$\rho \geq 0$, negative limits are defined to be zero.
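
The exact limits (2.10.10) and (2.10.11) depend on the data only through $F^*$, so they are easy to sketch in code. The Python fragment below is an illustration under stated assumptions: the function name is hypothetical, the two $F$ percentage points are supplied from a table, and negative limits are truncated at zero as described above. The inputs are those of the interview-ratings example in Section 2.14 ($F^* = 5.753$, $n = 4$).

```python
def exact_ratio_limits(f_star, n, f_lower, f_upper):
    """Exact limits for theta (2.10.10) and rho (2.10.11).

    f_lower = F[a-1, a(n-1); alpha/2], f_upper = F[a-1, a(n-1); 1 - alpha/2],
    both taken from an F table. Negative limits are set to zero.
    """
    theta = ((f_star / f_upper - 1) / n, (f_star / f_lower - 1) / n)
    rho = ((f_star - f_upper) / (f_star + (n - 1) * f_upper),
           (f_star - f_lower) / (f_star + (n - 1) * f_lower))
    clip = lambda pair: tuple(max(0.0, x) for x in pair)
    return clip(theta), clip(rho)


theta_limits, rho_limits = exact_ratio_limits(5.753, 4, f_lower=0.116, f_upper=3.804)
print([round(x, 3) for x in theta_limits])  # [0.128, 12.149]
print([round(x, 3) for x in rho_limits])    # [0.114, 0.924]
```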


Remark: Singhal (1987) and Groggel et al. (1988) provide methods for determining
approximate confidence intervals for $\rho$ under the assumption of nonnormality.




2.11 COMPUTATIONAL FORMULAE AND PROCEDURE 


For performing analysis of variance, packaged computer programs are widely 
available for handling calculations that would have been highly tedious or 
simply not feasible in the precomputer age. It is assumed that computer soft- 
ware is used in the handling of analysis of variance computations for all but the 
simplest data sets. For hand calculations, however, the definitional formulae 
for $SS_T$, $SS_B$, and $SS_W$ given in Section 2.3 are usually not very convenient. In
the following we give useful computational formulae, which are algebraically 
identical to the definitional formulae. Thus, 


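$$
\text{Total Sum of Squares } (SS_T) = \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar y_{..})^2
  = \sum_{i=1}^{a}\sum_{j=1}^{n} y_{ij}^2 - \frac{y_{..}^2}{an}, \tag{2.11.1}
$$

$$
\text{Between Group Sum of Squares } (SS_B) = n\sum_{i=1}^{a} (\bar y_{i.} - \bar y_{..})^2
  = \frac{1}{n}\sum_{i=1}^{a} y_{i.}^2 - \frac{y_{..}^2}{an}, \tag{2.11.2}
$$

and

$$
\text{Within Group Sum of Squares } (SS_W) = \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar y_{i.})^2
  = \sum_{i=1}^{a}\sum_{j=1}^{n} y_{ij}^2 - \frac{1}{n}\sum_{i=1}^{a} y_{i.}^2. \tag{2.11.3}
$$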


It should be noted that the within-group sum of squares ($SS_W$) can also be
computed by making use of the identity (2.3.7), giving

$$
SS_W = SS_T - SS_B. \tag{2.11.4}
$$


The relation (2.11.4) can further be used to check the validity of the earlier 
computations. 

Now, the steps in the analysis of variance computations can be summarized 
as follows: 


(i) Sum the observations for each level to form $y_{i.}$ for all $i$, and then obtain
the grand total $y_{..}$.
(ii) Form the sum of the squares of the individual observations to yield
$\sum_{i=1}^{a}\sum_{j=1}^{n} y_{ij}^2$.
(iii) Form the sum of squares of the totals for each level and divide it by $n$
to yield $\sum_{i=1}^{a} y_{i.}^2/n$.
(iv) Square the grand total and divide it by $an$ to yield the $y_{..}^2/an$ term, which
is known as the correction factor.
(v) The three sums of squares can now be obtained by using the
computational formulae (2.11.1) through (2.11.3).
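
Steps (i) through (v) can be sketched as a short Python function for the balanced case. This is an illustrative sketch, not the authors' program; the function name is an assumption, and the data are the wheat yields of Table 2.3 (Section 2.13).

```python
def balanced_sums_of_squares(groups):
    """Computational formulae (2.11.1)-(2.11.3) for a equal-sized groups."""
    a, n = len(groups), len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("all groups must have the same size")
    totals = [sum(g) for g in groups]                # step (i): level totals
    grand_total = sum(totals)                        # step (i): grand total
    raw = sum(y * y for g in groups for y in g)      # step (ii)
    between_raw = sum(t * t for t in totals) / n     # step (iii)
    correction = grand_total ** 2 / (a * n)          # step (iv)
    ss_total = raw - correction                      # step (v)
    ss_between = between_raw - correction
    ss_within = raw - between_raw
    return ss_total, ss_between, ss_within


wheat = [
    [96, 37, 58, 69, 73, 81],    # variety I
    [93, 81, 79, 101, 96, 102],  # variety II
    [60, 54, 78, 56, 61, 69],    # variety III
    [76, 89, 88, 84, 75, 68],    # variety IV
]
print(balanced_sums_of_squares(wheat))  # (6212.0, 2940.0, 3272.0)
```

The returned values satisfy the identity (2.11.4), which gives a cheap consistency check on the arithmetic.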


Remark: The computational formulae given in this section are very convenient to use. 
But a word of caution must be included for individuals who use a computer or an 
electronic calculator with an eight digit capacity. If we sum the squares of numbers 
containing three or more digits, we can exceed their capacity easily and thereby get 
erroneous results. In that case it would be better to use their definitional formulae to 
calculate the appropriate sums of squares. 


2.12 ANALYSIS OF VARIANCE FOR UNEQUAL 
NUMBERS OF OBSERVATIONS 


Equal numbers of observations for each treatment or at each factor level are 
desirable because of the simplicity of organizing the experiment and subsequent 
data analysis. Furthermore, for a given sample size, the analysis of variance
procedure is most powerful; that is, it provides the smallest value of the probability
for committing a type II error, when the number of observations for each level 
of a factor is the same. In addition, it has been found that the F test is relatively 
insensitive to the violation of the assumption of the homogeneity of variances 
when the samples are of equal size. However, due to a variety of reasons, it 
may happen that it is impossible to collect an equal number of observations at 
each level of the factor. Part of the data may have been lost, or certain treatment 
or factor levels, which are important for some other reasons, may have been 
emphasized by taking more observations at these levels. Thus, if more data are 
available at some levels than at others, we must take them all into the analysis. 

The analysis of variance for the one-way classification model with unequal 
numbers of observations is essentially the same as for the balanced case. One 
needs to make only minor changes in the formulae to account for unequal 
sample sizes. To avoid unnecessary repetition, we simply indicate the necessary 
changes in notation and present a summary table of the analysis of variance. 
Thus, suppose that the factor has $a$ different levels and that $n_i$ ($i = 1, 2, \ldots, a$)
observations have been made at the $i$th level, giving a total of $N = \sum_{i=1}^{a} n_i$
observations in all. The analysis of variance models and their assumptions
remain the same except that under Model I in place of $\sum_{i=1}^{a} \alpha_i = 0$, we now
need the restriction that $\sum_{i=1}^{a} n_i\alpha_i = 0$. Also, remember that the class mean $\bar y_{i.}$
is now based on $n_i$ observations. With this basic change, an identical analysis
can be carried out as in the balanced case without any conceptual difficulty. The 
details of the analysis are summarized in the form of an analysis of variance table 
as given in Table 2.2. The derivation of expected between mean squares under
Models I and II is somewhat involved and can be found in Graybill (1961, pp.
351-354; 1976, pp. 517-518).


TABLE 2.2
Analysis of Variance for Model (2.1.1) with Unequal Sample Sizes

                                                  Expected Mean Square²
Source of   Degrees of   Sum of     Mean     ------------------------------------------------------------
Variation   Freedom      Squares¹   Square   Model I                                      Model II                         F Value

Between     a - 1        SS_B       MS_B     $\sigma_e^2 + \sum_{i=1}^{a} n_i\alpha_i^2/(a-1)$   $\sigma_e^2 + n_0\sigma_\alpha^2$   MS_B/MS_W
Within      N - a        SS_W       MS_W     $\sigma_e^2$                                 $\sigma_e^2$
Total       N - 1        SS_T

¹ The sums of squares in this case are defined as follows:

$$
SS_B = \sum_{i=1}^{a} n_i(\bar y_{i.} - \bar y_{..})^2, \quad
SS_W = \sum_{i=1}^{a}\sum_{j=1}^{n_i} (y_{ij} - \bar y_{i.})^2, \quad \text{and} \quad
SS_T = \sum_{i=1}^{a}\sum_{j=1}^{n_i} (y_{ij} - \bar y_{..})^2,
$$

with

$$
\bar y_{i.} = \frac{\sum_{j=1}^{n_i} y_{ij}}{n_i} \quad \text{and} \quad
\bar y_{..} = \frac{\sum_{i=1}^{a}\sum_{j=1}^{n_i} y_{ij}}{N} = \frac{\sum_{i=1}^{a} n_i\bar y_{i.}}{N}.
$$

² $n_0 = (N^2 - \sum_{i=1}^{a} n_i^2)/N(a-1)$. If the number of observations is the same for each group,
that is, $n_1 = n_2 = \cdots = n_a = n$, then it follows that $n_0 = (a^2n^2 - an^2)/an(a-1) = n$.
Thus, the results of the analysis of variance for the unbalanced design reduce to that for the
balanced case.


The analysis of variance $F$ test and estimation
of variance components can be accomplished as before. For example, unbiased
estimators of $\sigma_e^2$ and $\sigma_\alpha^2$ are given by


$$
\hat\sigma_e^2 = MS_W \tag{2.12.1}
$$

and

$$
\hat\sigma_\alpha^2 = (MS_B - MS_W)/n_0, \tag{2.12.2}
$$

where $n_0$ is defined following Table 2.2. The estimator (2.12.1) is still the
minimum variance unbiased for $\sigma_e^2$. However, the estimator (2.12.2) is not the
best estimator of $\sigma_\alpha^2$, especially when the $n_i$'s differ greatly among themselves.¹²
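
A small Python sketch of $n_0$ and of the estimators (2.12.1) and (2.12.2) is given below. The function names are hypothetical; the sample sizes and mean squares are those of the corn-yield example analyzed in Section 2.14.

```python
def n_zero(sizes):
    """n_0 = (N^2 - sum n_i^2) / (N (a - 1)), as defined following Table 2.2."""
    total, a = sum(sizes), len(sizes)
    return (total ** 2 - sum(n * n for n in sizes)) / (total * (a - 1))


def variance_components(ms_b, ms_w, sizes):
    """ANOVA estimators (2.12.1) and (2.12.2) for unequal sample sizes."""
    sigma2_e = ms_w
    sigma2_alpha = (ms_b - ms_w) / n_zero(sizes)
    return sigma2_e, sigma2_alpha


sizes = [12, 4, 12, 11, 7, 7]
s2_e, s2_alpha = variance_components(17.553, 3.378, sizes)
print(round(n_zero(sizes), 3))                 # 8.626
print(round(s2_alpha, 3))                      # 1.643
print(round(s2_alpha / (s2_e + s2_alpha), 3))  # 0.327
print(n_zero([4] * 5))                         # 4.0, the balanced case n_0 = n
```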

It should be remarked that in the case of the unbalanced design, the between
mean square ($MS_B$) does not have a chi-square distribution when $\sigma_\alpha^2 > 0$, but instead
a weighted combination of chi-square distributions. Under the null hypothesis
$H_0: \sigma_\alpha^2 = 0$, the statistic $MS_B/MS_W$ has an $F$ distribution with $a - 1$ and $N - a$
degrees of freedom, and can be used to test the corresponding null hypothesis;
but its distribution under the alternative ($\sigma_\alpha^2 > 0$) is much more complicated
than in the corresponding balanced case. For some additional results on this topic
see Singh (1987) and Donner and Koval (1989). Similarly, the normal theory
confidence interval for $\sigma_e^2$ can be obtained in the usual way but the determination
of confidence intervals for $\sigma_\alpha^2$, $\sigma_\alpha^2/\sigma_e^2$, and $\sigma_\alpha^2/(\sigma_e^2 + \sigma_\alpha^2)$ is much more
complicated. The interested reader is referred to the book by Burdick and Graybill
(1992, pp. 68-77) for a concise discussion of methods of constructing
confidence intervals in the case of unbalanced design.¹³ In passing, we may note that
an exact $100(1-\alpha)$ percent confidence interval for $\sigma_e^2$ and approximate
$100(1-\alpha)$ percent confidence intervals for $\sigma_\alpha^2$, $\sigma_\alpha^2/\sigma_e^2$, and the intraclass correlation
$\rho = \sigma_\alpha^2/(\sigma_e^2 + \sigma_\alpha^2)$, based on the distribution of the mean squares, are given by


¹² For some further discussions on this point, see Robertson (1962) and Kendall et al. (1983, Section
36.26). For some alternative estimators of variance components for an unbalanced design see
Searle (1971b, Chapter 10).


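$$
P\left[\frac{(N-a)\,MS_W}{\chi^2[N-a,\,1-\alpha/2]} < \sigma_e^2 <
       \frac{(N-a)\,MS_W}{\chi^2[N-a,\,\alpha/2]}\right] = 1-\alpha, \tag{2.12.3}
$$

$$
P\left[\frac{L\,MS_B^*}{(1+n^*L)\,F[a-1,\,\infty;\,1-\alpha/2]} < \sigma_\alpha^2 <
       \frac{U\,MS_B^*}{(1+n^*U)\,F[a-1,\,\infty;\,\alpha/2]}\right] \doteq 1-\alpha, \tag{2.12.4}
$$

$$
P\left[\frac{MS_B^*}{n^*MS_W\,F[a-1,\,N-a;\,1-\alpha/2]} - \frac{1}{n_{\min}} < \frac{\sigma_\alpha^2}{\sigma_e^2} <
       \frac{MS_B^*}{n^*MS_W\,F[a-1,\,N-a;\,\alpha/2]} - \frac{1}{n_{\max}}\right] \doteq 1-\alpha, \tag{2.12.5}
$$

and

$$
P\left[\frac{F^*/F[a-1,\,N-a;\,1-\alpha/2] - 1}{F^*/F[a-1,\,N-a;\,1-\alpha/2] + (n_0-1)} <
       \frac{\sigma_\alpha^2}{\sigma_e^2+\sigma_\alpha^2} <
       \frac{F^*/F[a-1,\,N-a;\,\alpha/2] - 1}{F^*/F[a-1,\,N-a;\,\alpha/2] + (n_0-1)}\right] \doteq 1-\alpha, \tag{2.12.6}
$$

where

$$
N = \sum_{i=1}^{a} n_i, \qquad n^* = \frac{a}{\sum_{i=1}^{a} 1/n_i}, \qquad
n_0 = \frac{N^2 - \sum_{i=1}^{a} n_i^2}{N(a-1)},
$$
$$
F^* = \frac{MS_B}{MS_W}, \qquad MS_B^* = \frac{n^*\sum_{i=1}^{a}(\bar y_{i.} - \bar y_{..}^*)^2}{a-1},
$$
$$
L = \frac{MS_B^*}{n^*MS_W\,F[a-1,\,N-a;\,1-\alpha/2]} - \frac{1}{n_{\min}}, \qquad
U = \frac{MS_B^*}{n^*MS_W\,F[a-1,\,N-a;\,\alpha/2]} - \frac{1}{n_{\max}},
$$
$$
n_{\min} = \min(n_1, n_2, \ldots, n_a), \qquad n_{\max} = \max(n_1, n_2, \ldots, n_a),
$$

with

$$
\bar y_{i.} = \sum_{j=1}^{n_i} y_{ij}/n_i \quad \text{and} \quad \bar y_{..}^* = \sum_{i=1}^{a} \bar y_{i.}/a.
$$


¹³ Methods for constructing confidence intervals for $\sigma_\alpha^2$, $\sigma_\alpha^2/\sigma_e^2$, $\sigma_e^2 + \sigma_\alpha^2$, and $\sigma_\alpha^2/(\sigma_e^2 + \sigma_\alpha^2)$ have
been discussed by Thomas and Hultquist (1978), Burdick and Graybill (1984), Burdick and
Eickman (1986), Burdick et al. (1986b), and Donner and Wells (1986).


TABLE 2.3
Data on Yields of Four Different Varieties
of Wheat (in bushels per acre)

            Varieties
   I      II     III     IV
  96      93      60     76
  37      81      54     89
  58      79      78     88
  69     101      56     84
  73      96      61     75
  81     102      69     68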


2.13 WORKED EXAMPLES FOR MODEL I 


Suppose a crop scientist wishes to test the effect of four varieties of wheat on 
the resultant yield. She designs an experiment with 24 plots of the same size 
and shape and sows each variety at random in 6 of the 24 plots. The yields from 
these 24 plots provide the data for a one-way classification with equal sample 
sizes and are presented in Table 2.3. 

The data from the experiment just described must be analyzed under Model I 
since the four varieties are specially selected by the experimenter to be of 
particular interest to her. Hence, the factor under investigation (varieties of 
wheat) will have a fixed effect. In this example, a = 4, n = 6, and the resultant 
calculations for the sums of squares, using the computational formulae given 
in Section 2.11, are summarized in the following. 




The marginal totals corresponding to the different varieties are

$$
y_{1.} = 414, \quad y_{2.} = 552, \quad y_{3.} = 378, \quad y_{4.} = 480;
$$

and the grand total is

$$
y_{..} = 1{,}824.
$$

The other quantities needed in the calculations of the sums of squares are:

$$
\frac{y_{..}^2}{an} = \frac{(1{,}824)^2}{4 \times 6} = 138{,}624,
$$

$$
\frac{1}{n}\sum_{i=1}^{a} y_{i.}^2 = \frac{1}{6}\left[(414)^2 + (552)^2 + (378)^2 + (480)^2\right] = 141{,}564,
$$

and

$$
\sum_{i=1}^{a}\sum_{j=1}^{n} y_{ij}^2 = (96)^2 + (37)^2 + \cdots + (75)^2 + (68)^2 = 144{,}836.
$$
i=1 j=l 


The resultant sums of squares are, therefore, obtained as follows. 


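$$
SS_T = \sum_{i=1}^{a}\sum_{j=1}^{n} y_{ij}^2 - \frac{y_{..}^2}{an} = 144{,}836 - 138{,}624 = 6{,}212,
$$

$$
SS_B = \frac{1}{n}\sum_{i=1}^{a} y_{i.}^2 - \frac{y_{..}^2}{an} = 141{,}564 - 138{,}624 = 2{,}940,
$$

and

$$
SS_W = \sum_{i=1}^{a}\sum_{j=1}^{n} y_{ij}^2 - \frac{1}{n}\sum_{i=1}^{a} y_{i.}^2 = 144{,}836 - 141{,}564 = 3{,}272.
$$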


Finally, the results of the analysis of variance calculations are summarized 
in Table 2.4. If we choose the significance level of a = 0.05, we find from 
Appendix Table V that the 95th percentage point of the F distribution with 3 and 
20 degrees of freedom is 3.10. Since the value of the F statistic from Table 2.4 is 
5.99, which is greater than 3.10 (p = 0.004), we may conclude that the effects 
of the four varieties of wheat are significantly different. Stated another way, the 
response variability attributable to the means of varieties is significantly greater
than the variability due to uncontrolled experimental error. The conclusion of
the $F$ test from Table 2.4 may not have surprised the agricultural investigator. In
the first place, she conducted the study because she expected the four varieties
of wheat to have different effects on yield and was interested in finding which
varieties lead to higher yield. We discuss this problem, namely, how to study
the nature of the factor level effects when differences exist, in Section 2.19.


TABLE 2.4
Analysis of Variance for the Yields Data of Table 2.3

Source of   Degrees of   Sum of     Mean       Expected
Variation   Freedom      Squares    Square     Mean Square                                          F Value   p-Value

Between     3            2,940      980.000    $\sigma_e^2 + \frac{6}{4-1}\sum_{i=1}^{4}\alpha_i^2$   5.99      0.004
Varieties
Within      20           3,272      163.600    $\sigma_e^2$
Varieties
Total       23           6,212


TABLE 2.5
Data on Blood Analysis of Animals Injected
With Five Drugs

               Drugs
   A       B       C       D       E
 19.0     7.0     4.0     6.0     6.0
 11.0     1.0     7.0     6.0     4.0
 15.0     4.0     7.0     6.0     2.0
          4.0            10.0

For another example, involving unequal sample sizes, suppose a pharma- 
ceutical research company conducts an experiment to compare efficacy of five 
drugs. There are 20 animals available for the trial and each drug is injected into 
4 randomly selected animals. Three animals die during the course of the experi- 
ment. The blood samples from the remaining animals are taken and analyzed. 
The data on blood pH reading from each blood analysis in certain standardized 
units are presented in Table 2.5. 

The data of Table 2.5 should be analyzed again under Model I since the five 
drugs are specially chosen by the company to be of particular interest. Hence, 
the factor under investigation (drugs) will have a systematic effect. In this 
example, $a = 5$, $n_1 = 3$, $n_2 = 4$, $n_3 = 3$, $n_4 = 4$, and $n_5 = 3$; and the resultant
calculations for the sums of squares are summarized in the following. 




The marginal totals corresponding to the five different drugs are

$$
y_{1.} = 45.0, \quad y_{2.} = 16.0, \quad y_{3.} = 18.0, \quad y_{4.} = 28.0, \quad y_{5.} = 12.0;
$$

and the grand total is

$$
y_{..} = 119.0.
$$

The other quantities needed in the calculations of the sums of squares are
obtained as

$$
\frac{y_{..}^2}{N} = \frac{(119)^2}{17} = 833.000,
$$

$$
\sum_{i=1}^{a} \frac{y_{i.}^2}{n_i} = \frac{(45)^2}{3} + \frac{(16)^2}{4} + \frac{(18)^2}{3} + \frac{(28)^2}{4} + \frac{(12)^2}{3} = 1{,}091.000,
$$

and

$$
\sum_{i=1}^{a}\sum_{j=1}^{n_i} y_{ij}^2 = (19)^2 + (11)^2 + \cdots + (2)^2 = 1{,}167.000.
$$
i=1 j=l 
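The resulting sums of squares are, therefore, given as

$$
SS_T = \sum_{i=1}^{a}\sum_{j=1}^{n_i} y_{ij}^2 - \frac{y_{..}^2}{N} = 1{,}167.000 - 833.000 = 334.000,
$$

$$
SS_B = \sum_{i=1}^{a} \frac{y_{i.}^2}{n_i} - \frac{y_{..}^2}{N} = 1{,}091.000 - 833.000 = 258.000,
$$

and

$$
SS_W = \sum_{i=1}^{a}\sum_{j=1}^{n_i} y_{ij}^2 - \sum_{i=1}^{a} \frac{y_{i.}^2}{n_i} = 1{,}167.000 - 1{,}091.000 = 76.000.
$$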


Finally, the results of the analysis of variance calculations are summarized
in Table 2.6. If we choose the level of significance $\alpha = 0.05$, we find from
Appendix Table V that the 95th percentage point of the $F$ distribution with 4
and 12 degrees of freedom is 3.26. Since the value of the $F$ statistic
from Table 2.6 is 10.18, which is greater than 3.26 ($p < 0.001$), we may conclude that the
different drugs do not lead to the same mean response; that is, there is a relation
between the injected drug and the pH reading from the blood analysis.


TABLE 2.6
Analysis of Variance for the Blood Analysis Data of Table 2.5

Source of   Degrees of   Sum of     Mean      Expected
Variation   Freedom      Squares    Square    Mean Square                                             F Value   p-Value

Between     4            258.000    64.500    $\sigma_e^2 + \frac{1}{5-1}\sum_{i=1}^{5} n_i\alpha_i^2$   10.18     <0.001
Drugs
Within      12           76.000     6.333     $\sigma_e^2$
Drugs
Total       16           334.000


TABLE 2.7
Interview Ratings by Five Staff Members

           Staff Members
   I      II     III     IV      V
  86      67      57     83     85
  75      86      74     80     84
  94      90      71     96     92
  86      76      55     99     91

This conclusion may not have surprised the drug company. In the first place, 
it conducted the study because it was suspected that five drugs would have 
different reactions and the company was interested in finding out the nature of 
these differences. In Section 2.19, we discuss the second stage of the analysis, 
namely, how to study the nature of the factor level effects when differences exist. 
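
The unequal-sample-size computations for the drug data can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the function name and the dictionary keys are illustrative, and the tabled critical value 3.26 is taken from the text rather than computed.

```python
def one_way_anova(groups):
    """Sums of squares, mean squares and F ratio for possibly unequal group sizes."""
    sizes = [len(g) for g in groups]
    a, total_n = len(groups), sum(sizes)
    totals = [sum(g) for g in groups]
    correction = sum(totals) ** 2 / total_n              # y..^2 / N
    raw = sum(y * y for g in groups for y in g)          # sum of squared observations
    between_raw = sum(t * t / n for t, n in zip(totals, sizes))
    ss_between, ss_within = between_raw - correction, raw - between_raw
    ms_between, ms_within = ss_between / (a - 1), ss_within / (total_n - a)
    return {"ss_between": ss_between, "ss_within": ss_within,
            "ms_between": ms_between, "ms_within": ms_within,
            "f": ms_between / ms_within}


drugs = [
    [19.0, 11.0, 15.0],      # A
    [7.0, 1.0, 4.0, 4.0],    # B
    [4.0, 7.0, 7.0],         # C
    [6.0, 6.0, 6.0, 10.0],   # D
    [6.0, 4.0, 2.0],         # E
]
result = one_way_anova(drugs)
print(result["ss_between"], result["ss_within"])  # 258.0 76.0
print(round(result["f"], 2))                      # 10.18
print(result["f"] > 3.26)                         # True: exceeds the tabled 5% point
```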


2.14 WORKED EXAMPLES FOR MODEL II 


Suppose a college admissions office wishes to study the results of the interview 
ratings of prospective students by its staff members. Five staff members are 
selected at random and four prospective students are assigned randomly to each. 
The results provide the data for a one-way classification with equal sample sizes 
and are given in Table 2.7. 

The data of Table 2.7 should be analyzed under Model II since the five staff 
members are randomly selected from the list of college staff and the results 
of the analysis are to be valid for the entire pool of college staff. Hence, the 
factor under study should be regarded as having random effects. Here, $a = 5$
and $n = 4$ and the resultant calculations for the sums of squares using the
computational formulae are given in the following.

The marginal totals for ratings corresponding to the five staff members are

$$
y_{1.} = 341, \quad y_{2.} = 319, \quad y_{3.} = 257, \quad y_{4.} = 358, \quad y_{5.} = 352;
$$

and the grand total is

$$
y_{..} = 1{,}627.
$$


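The other quantities needed in the calculations of the sums of squares are
obtained as

$$
\frac{y_{..}^2}{an} = \frac{(1{,}627)^2}{5 \times 4} = 132{,}356.450,
$$

$$
\frac{1}{n}\sum_{i=1}^{a} y_{i.}^2 = \frac{1}{4}\left[(341)^2 + (319)^2 + (257)^2 + (358)^2 + (352)^2\right] = 134{,}039.750,
$$

and

$$
\sum_{i=1}^{a}\sum_{j=1}^{n} y_{ij}^2 = (86)^2 + (75)^2 + \cdots + (92)^2 + (91)^2 = 135{,}137.
$$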


i=] J= 


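The resultant sums of squares are, therefore, given as

$$
SS_T = \sum_{i=1}^{a}\sum_{j=1}^{n} y_{ij}^2 - \frac{y_{..}^2}{an} = 135{,}137 - 132{,}356.450 = 2{,}780.550,
$$

$$
SS_B = \frac{1}{n}\sum_{i=1}^{a} y_{i.}^2 - \frac{y_{..}^2}{an} = 134{,}039.750 - 132{,}356.450 = 1{,}683.300,
$$

and

$$
SS_W = \sum_{i=1}^{a}\sum_{j=1}^{n} y_{ij}^2 - \frac{1}{n}\sum_{i=1}^{a} y_{i.}^2 = 135{,}137 - 134{,}039.750 = 1{,}097.250.
$$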


Finally, the results of the analysis of variance calculations are summarized
in Table 2.8. Here, the null hypothesis states that the variability in interview
rating among staff members is due entirely to the natural variability among
students, whereas the alternative hypothesis states that there is an additional
variability among staff members due primarily to differences in staff members'
rating schemes. If we choose the level of significance $\alpha = 0.05$, we find from
Appendix Table V that the 95th percentage point of the $F$ distribution with 4
and 15 degrees of freedom is 3.06. Since the computed $F$ value of 5.753 from
Table 2.8 is greater than 3.06 ($p = 0.005$), we may conclude that $\sigma_\alpha^2 > 0$, or
that the mean ratings of the staff members differ significantly.


TABLE 2.8
Analysis of Variance for the Interview Ratings Data of Table 2.7

Source of   Degrees of   Sum of       Mean       Expected
Variation   Freedom      Squares      Square     Mean Square                        F Value   p-Value

Between     4            1,683.300    420.825    $\sigma_e^2 + 4\sigma_\alpha^2$     5.753     0.005
Staff
Within      15           1,097.250    73.150     $\sigma_e^2$
Staff
Total       19           2,780.550

Furthermore, if the experimenter is interested in estimating the magnitudes
of the components of variance $\sigma_e^2$ and $\sigma_\alpha^2$, we may obtain their unbiased
estimates using the formulae (2.9.8) and (2.9.9). Hence, from (2.9.8) and (2.9.9),
we find that

$$
\hat\sigma_e^2 = 73.150
$$

and

$$
\hat\sigma_\alpha^2 = \frac{420.825 - 73.150}{4} = 86.919.
$$

The estimate of the total variance $\sigma_y^2$ is then given by

$$
\hat\sigma_y^2 = \hat\sigma_e^2 + \hat\sigma_\alpha^2 = 73.150 + 86.919 = 160.069,
$$

and the estimated proportion of the total variance accounted for by the staff
members is

$$
\frac{\hat\sigma_\alpha^2}{\hat\sigma_y^2} = \frac{86.919}{160.069} = 0.543.
$$


Thus, we observe that about 54 percent of the variance among interview ratings
seems to be due to differences among staff members. This would be a most
important finding in such an experiment, as it would suggest that repetitions
of this experiment involving different staff members would not be comparable.
A change in experimental procedure or some better control over staff ratings
would clearly be advisable.

To obtain a 95 percent confidence interval for $\sigma_e^2$, we have


$$
MS_W = 73.150, \quad \chi^2[15, 0.025] = 6.262, \quad \text{and} \quad \chi^2[15, 0.975] = 27.488.
$$

Substituting these values in (2.10.2), the desired 95 percent confidence interval
for $\sigma_e^2$ is given by

$$
P\left[\frac{15 \times 73.150}{27.488} < \sigma_e^2 < \frac{15 \times 73.150}{6.262}\right] = 0.95
$$

or

$$
P[39.917 < \sigma_e^2 < 175.224] = 0.95.
$$


Similarly, to obtain a 95 percent conservative confidence interval for $\sigma_\alpha^2$ from
(2.10.5), we have

$$
\chi^2[4, 0.025] = 0.484, \quad \chi^2[4, 0.975] = 11.143,
$$
$$
F[4, 15; 0.025] = 0.116, \quad \text{and} \quad F[4, 15; 0.975] = 3.804.
$$

Substituting appropriate values in (2.10.5), the desired 95 percent confidence
interval for $\sigma_\alpha^2$ is given by

$$
P\left[\frac{4 \times 420.825}{4 \times 11.143}\left(1 - \frac{3.804}{5.753}\right) < \sigma_\alpha^2 <
       \frac{4 \times 420.825}{4 \times 0.484}\left(1 - \frac{0.116}{5.753}\right)\right] \geq 0.95
$$

or

$$
P[12.794 < \sigma_\alpha^2 < 851.942] \geq 0.95.
$$


Furthermore, to obtain a 95 percent approximate confidence interval for $\sigma_\alpha^2$
from (2.10.6) and (2.10.7), we have

$$
F[4, \infty; 0.025] = 0.121, \quad \text{and} \quad F[4, \infty; 0.975] = 2.790.
$$

Substituting appropriate values in (2.10.6), the desired 95 percent confidence
interval for $\sigma_\alpha^2$ is given by

$$
P\left[\frac{73.150}{4 \times 2.790}(5.753 - 3.804) < \sigma_\alpha^2 <
       \frac{73.150}{4 \times 0.121}(5.753 - 0.116)\right] \doteq 0.95
$$

or

$$
P[12.775 < \sigma_\alpha^2 < 851.956] \doteq 0.95.
$$


Similarly, substituting the appropriate values in (2.10.7), the desired 95 percent
confidence interval for $\sigma_\alpha^2$ is given by

$$
P\left[\frac{73.150(5.753 - 3.804)(5.753 + 3.804 - 2.790)}{4 \times 5.753 \times 2.790} < \sigma_\alpha^2 <
       \frac{73.150(5.753 - 0.116)(5.753 + 0.116 - 0.121)}{4 \times 5.753 \times 0.121}\right] \doteq 0.95
$$

or

$$
P[15.027 < \sigma_\alpha^2 < 851.215] \doteq 0.95.
$$


Likewise, to obtain a 95 percent confidence interval for $\sigma_\alpha^2$ from (2.10.8), we
compute the following quantities:

$$
G_1 = 0.6416, \quad G_2 = 0.4536, \quad H_1 = 7.2645, \quad H_2 = 1.3981,
$$
$$
G_{12} = -0.1289, \quad H_{12} = -1.1587,
$$
$$
V_L = 79{,}392.0996 \quad \text{and} \quad V_U = 9{,}311{,}190.0700.
$$

Substituting appropriate values in (2.10.8), we obtain

$$
P[16.477 < \sigma_\alpha^2 < 849.775] \doteq 0.95.
$$

Note that this interval is slightly shorter than the intervals for
$\sigma_\alpha^2$ reported previously.
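
The arithmetic behind the first two approximate intervals for $\sigma_\alpha^2$ can be sketched in Python. The function names below are hypothetical, and all chi-square and $F$ percentage points are passed in as tabled numbers, as in the text. Because the sketch uses the unrounded ratio $F^* = 420.825/73.150$, the last digits can differ slightly from the hand calculations above.

```python
def williams_interval(ms_b, ms_w, a, n, chi2_lo, chi2_hi, f_lo, f_hi):
    """Conservative interval (2.10.5); *_lo / *_hi are the alpha/2 and 1 - alpha/2 points."""
    f_star = ms_b / ms_w
    lower = (a - 1) * ms_b * (1 - f_hi / f_star) / (n * chi2_hi)
    upper = (a - 1) * ms_b * (1 - f_lo / f_star) / (n * chi2_lo)
    return lower, upper


def mean_square_ratio_interval(ms_b, ms_w, n, f_lo, f_hi, f_inf_lo, f_inf_hi):
    """Approximate interval (2.10.6); f_inf_* use infinite denominator degrees of freedom."""
    f_star = ms_b / ms_w
    lower = ms_w * (f_star - f_hi) / (n * f_inf_hi)
    upper = ms_w * (f_star - f_lo) / (n * f_inf_lo)
    return lower, upper


# Interview-ratings data: a = 5, n = 4, MS_B = 420.825, MS_W = 73.150.
print(williams_interval(420.825, 73.150, 5, 4, 0.484, 11.143, 0.116, 3.804))
print(mean_square_ratio_interval(420.825, 73.150, 4, 0.116, 3.804, 0.121, 2.790))
```

Both calls return limits close to 12.8 and 852, in line with the intervals reported above.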

Finally, substituting the appropriate values in (2.10.10) through (2.10.12),
the desired 95 percent confidence intervals for $\sigma_\alpha^2/\sigma_e^2$, $\sigma_\alpha^2/(\sigma_e^2 + \sigma_\alpha^2)$, and
$\sigma_e^2/(\sigma_e^2 + \sigma_\alpha^2)$ are given by

$$
P\left[\frac{1}{4}\left(\frac{5.753}{3.804} - 1\right) < \frac{\sigma_\alpha^2}{\sigma_e^2} <
       \frac{1}{4}\left(\frac{5.753}{0.116} - 1\right)\right] = 0.95
$$

or

$$
P\left[0.128 < \frac{\sigma_\alpha^2}{\sigma_e^2} < 12.149\right] = 0.95,
$$

$$
P\left[\frac{5.753 - 3.804}{5.753 + (4-1)3.804} < \frac{\sigma_\alpha^2}{\sigma_e^2 + \sigma_\alpha^2} <
       \frac{5.753 - 0.116}{5.753 + (4-1)0.116}\right] = 0.95
$$

or

$$
P\left[0.114 < \frac{\sigma_\alpha^2}{\sigma_e^2 + \sigma_\alpha^2} < 0.924\right] = 0.95,
$$

and

$$
P\left[\frac{4 \times 0.116}{5.753 + (4-1)0.116} < \frac{\sigma_e^2}{\sigma_e^2 + \sigma_\alpha^2} <
       \frac{4 \times 3.804}{5.753 + (4-1)3.804}\right] = 0.95
$$

or

$$
P\left[0.076 < \frac{\sigma_e^2}{\sigma_e^2 + \sigma_\alpha^2} < 0.887\right] = 0.95.
$$


TABLE 2.9
Data on Yields of Six Varieties of Corn (in bushels per acre)

                              Varieties
Four Country   Silver King   Iodent   Lancaster   Osterland   Clark
     7.3           7.7         6.9       9.6         4.8       4.3
     4.5           5.4         6.8       7.8         9.2       8.4
     7.4           5.2         7.6       9.6         8.5       6.6
     7.4           4.0         8.1       7.7         8.8       4.9
     5.0                       9.4       8.2         7.9       5.8
     5.9                      12.0       7.3         5.9       7.6
     6.4                      15.9      11.3         9.2       3.7
     6.3                       7.4       9.5
     5.0                       9.0       8.8
     6.1                       5.2       8.4
     7.9                       9.2       6.8
     5.7                       8.6

Source: Snedecor (1934). Used with permission.


For another example involving unequal sample sizes, consider the data from 
an experiment reported by Snedecor (1934) who compared the yields of a 
number of varieties of corn, each variety being represented by several inbred 
lines. The data on yields (in bushels per acre) for six varieties of their inbred 
lines are given in Table 2.9. 

The data in Table 2.9 should again be analyzed under Model II since each 
variety of corn is being represented by several inbred lines and the results of 
the analysis are to be applicable for all the varieties. Here, $a = 6$, $n_1 = 12$,
$n_2 = 4$, $n_3 = 12$, $n_4 = 11$, $n_5 = 7$, $n_6 = 7$, $N = \sum_{i=1}^{6} n_i = 53$, $n_0 =
(N^2 - \sum_{i=1}^{6} n_i^2)/N(a-1) = 8.626$; and the resultant calculations for the sums
of squares are given in the following.

The marginal totals for yields corresponding to six varieties of corn are

$$
y_{1.} = 74.9, \quad y_{2.} = 22.3, \quad y_{3.} = 106.1, \quad y_{4.} = 95.0, \quad y_{5.} = 54.3, \quad y_{6.} = 41.3;
$$

and the grand total is

$$
y_{..} = 393.9.
$$

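The other quantities needed in the calculations of the sums of squares are
obtained as

$$
\frac{y_{..}^2}{N} = \frac{(393.9)^2}{53} = 2{,}927.495,
$$

$$
\sum_{i=1}^{a} \frac{y_{i.}^2}{n_i} = \frac{(74.9)^2}{12} + \frac{(22.3)^2}{4} + \frac{(106.1)^2}{12} + \frac{(95.0)^2}{11} + \frac{(54.3)^2}{7} + \frac{(41.3)^2}{7} = 3{,}015.262,
$$

and

$$
\sum_{i=1}^{a}\sum_{j=1}^{n_i} y_{ij}^2 = (7.3)^2 + (4.5)^2 + \cdots + (3.7)^2 = 3{,}174.010.
$$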
i=1 j=l 


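The resulting sums of squares are, therefore, given as

$$
SS_T = \sum_{i=1}^{a}\sum_{j=1}^{n_i} y_{ij}^2 - \frac{y_{..}^2}{N} = 3{,}174.010 - 2{,}927.495 = 246.515,
$$

$$
SS_B = \sum_{i=1}^{a} \frac{y_{i.}^2}{n_i} - \frac{y_{..}^2}{N} = 3{,}015.262 - 2{,}927.495 = 87.767,
$$

and

$$
SS_W = \sum_{i=1}^{a}\sum_{j=1}^{n_i} y_{ij}^2 - \sum_{i=1}^{a} \frac{y_{i.}^2}{n_i} = 3{,}174.010 - 3{,}015.262 = 158.748.
$$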


Finally, the results of the analysis of variance calculations are summarized in
Table 2.10. Here, the null hypothesis states that the variation in yields among
varieties of corn is due entirely to the natural variability among replicates,
whereas the alternative hypothesis states that there is an additional variability
among varieties of corn. If we choose the level of significance $\alpha = 0.05$, we
find from Appendix Table V that the 95th percentage point of the $F$ distribution
with 5 and 47 degrees of freedom is 2.41. Since the computed $F$ value of 5.196
from Table 2.10 is greater than 2.41 ($p < 0.001$), we may conclude that $\sigma_\alpha^2 > 0$,
or that the mean yields of the varieties differ significantly.


TABLE 2.10
Analysis of Variance for the Yields Data of Table 2.9

Source of   Degrees of   Sum of     Mean      Expected
Variation   Freedom      Squares    Square    Mean Square                            F Value   p-Value

Between     5            87.767     17.553    $\sigma_e^2 + 8.626\sigma_\alpha^2$     5.196     <0.001
Varieties
Within      47           158.748    3.378     $\sigma_e^2$
Varieties
Total       52           246.515

Furthermore, if the experimenter 1s interested 1n estimating the magnitudes of 
the components of variance o2 and a2, we may obtain their unbiased estimates 
using the formulae (2.12.1) and (2.12.2). Hence, from (2.12.1) and (2.12.2), we 
find that 


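\[
\hat{\sigma}_e^2 = 3.378
\]

and

\[
\hat{\sigma}_\alpha^2 = \frac{17.553 - 3.378}{8.626} = 1.643.
\]

The estimate of the total variance $\sigma_y^2$ is then given by

\[
\hat{\sigma}_y^2 = \hat{\sigma}_e^2 + \hat{\sigma}_\alpha^2 = 3.378 + 1.643 = 5.021,
\]

and the estimated proportion of the total variance accounted for by the varieties
is

\[
\frac{\hat{\sigma}_\alpha^2}{\hat{\sigma}_y^2} = \frac{1.643}{5.021} = 0.327.
\]
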


Thus, we observe that about 33 percent of the variance among yields seems to 
be due to differences among varieties. The remaining 67 percent of variance 
can be attributed to within-variety variation. 
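
As a check on these estimates, the sketch below recomputes the variance components in Python from the mean squares of Table 2.10. It assumes that the coefficient 8.626 of $\sigma_\alpha^2$ in the expected mean square is the usual unbalanced-data quantity $n_0 = (N - \sum n_i^2/N)/(a - 1)$; that formula is not restated in this section, so it is labeled as an assumption in the code.

```python
# Mean squares from Table 2.10 and group sizes from Table 2.9.
ms_between = 17.553
ms_within = 3.378
sizes = [12, 4, 12, 11, 7, 7]

a = len(sizes)
N = sum(sizes)

# Assumption: n0 is the standard coefficient of sigma_alpha^2 in E(MS_B)
# for unbalanced data, n0 = (N - sum(n_i^2) / N) / (a - 1).
n0 = (N - sum(n * n for n in sizes) / N) / (a - 1)

var_error = ms_within                          # estimate of sigma_e^2
var_variety = (ms_between - ms_within) / n0    # estimate of sigma_alpha^2
var_total = var_error + var_variety            # estimate of sigma_y^2
share = var_variety / var_total                # proportion due to varieties

print(round(n0, 3), round(var_variety, 3), round(var_total, 3), round(share, 3))
```

Under that assumption the computed coefficient agrees with the value 8.626 used in the text, and the estimates match those given above to three decimal places.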

To obtain an exact 95 percent confidence interval for $\sigma_e^2$, we have

\[
MS_W = 3.378,\quad \chi^2[47, 0.025] = 29.956,\quad \text{and}\quad \chi^2[47, 0.975] = 67.821.
\]

Substituting these values in (2.12.3), the desired 95 percent confidence interval
for $\sigma_e^2$ is given by

\[
P\left[\frac{47 \times 3.378}{67.821} < \sigma_e^2 < \frac{47 \times 3.378}{29.956}\right] = 0.95
\]

or

\[
P[2.341 < \sigma_e^2 < 5.300] = 0.95.
\]
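
The limits of this interval can be reproduced with a few lines of Python. This is only an illustrative sketch: the chi-square percentage points are taken as given from the text rather than computed, and the function name is hypothetical.

```python
def error_variance_interval(ms_within, df, chi2_lower_pct, chi2_upper_pct):
    """Return (lower, upper) limits df*MS_W/chi2_upper and df*MS_W/chi2_lower."""
    ss_within = df * ms_within
    return ss_within / chi2_upper_pct, ss_within / chi2_lower_pct

# Values quoted in the text: MS_W = 3.378 on 47 degrees of freedom,
# chi2[47, 0.025] = 29.956 and chi2[47, 0.975] = 67.821.
lower, upper = error_variance_interval(3.378, 47, 29.956, 67.821)
print(round(lower, 3), round(upper, 3))
```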
Now, to obtain an approximate confidence interval for $\sigma_\alpha^2$, we have


\[
F[5, 47; 0.975] = 2.851,\quad F[5, 47; 0.025] = 0.163,
\]
\[
F[5, \infty; 0.975] = 2.570,\quad F[5, \infty; 0.025] = 0.166,
\]
\[
n_0 = 8.626,\quad n_h = 7.563,
\]
\[
F^{*} = 5.196,\quad MS_B^{*} = 15.591,
\]
\[
L = -0.036,\quad \text{and}\quad U = 3.661.
\]


Substituting these values in (2.12.4), the desired 95 percent confidence interval
for $\sigma_\alpha^2$ is given by

\[
P[-0.030 < \sigma_\alpha^2 < 11.985] = 0.95.
\]

Furthermore, to obtain an approximate confidence interval for $\sigma_\alpha^2/\sigma_e^2$, we
substitute the required quantities in (2.12.5), which yields the desired interval as

\[
P\left[-0.036 < \frac{\sigma_\alpha^2}{\sigma_e^2} < 3.61\right] = 0.95.
\]
It is to be understood that the negative limits are defined to be zero. It is, however, 
informative to leave them with negative signs. 
Similarly, to obtain an approximate 95 percent confidence interval for the
intraclass correlation, we have

\[
F^{*} = 5.196,\quad F[5, 47; 0.025] = 0.163,\quad \text{and}\quad F[5, 47; 0.975] = 2.851.
\]


Substituting these values in (2.12.6), the desired 95 percent confidence interval
for the intraclass correlation is given by

\[
P\left[\frac{(5.196/2.851) - 1}{(5.196/2.851) + (8.626 - 1)} < \frac{\sigma_\alpha^2}{\sigma_e^2 + \sigma_\alpha^2} < \frac{(5.196/0.163) - 1}{(5.196/0.163) + (8.626 - 1)}\right] \doteq 0.95
\]

or

\[
P\left[0.087 < \frac{\sigma_\alpha^2}{\sigma_e^2 + \sigma_\alpha^2} < 0.782\right] \doteq 0.95.
\]
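
The two limits for the intraclass correlation follow from simple arithmetic on the F ratio. The Python sketch below evaluates the expression substituted into (2.12.6) above, taking the tabulated F percentage points and the coefficient 8.626 from the text; the function and argument names are illustrative, not the authors'.

```python
def intraclass_interval(f_ratio, f_lower_pct, f_upper_pct, n0):
    """Approximate limits for sigma_alpha^2 / (sigma_e^2 + sigma_alpha^2).

    Each limit has the form (r - 1) / (r + n0 - 1), where r is the observed
    F ratio divided by the upper (for the lower limit) or lower (for the
    upper limit) percentage point of the F distribution.
    """
    r_low = f_ratio / f_upper_pct
    r_high = f_ratio / f_lower_pct
    lower = (r_low - 1.0) / (r_low + n0 - 1.0)
    upper = (r_high - 1.0) / (r_high + n0 - 1.0)
    return lower, upper

# F* = 5.196, F[5, 47; 0.025] = 0.163, F[5, 47; 0.975] = 2.851, n0 = 8.626.
lower, upper = intraclass_interval(5.196, 0.163, 2.851, 8.626)
print(round(lower, 3), round(upper, 3))
```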


2.15 USE OF STATISTICAL COMPUTING PACKAGES 


One-way analysis of variance can be performed by a number of statistical pack- 
ages using either a mainframe or a microcomputer. SAS, SPSS, and BMDP each 
contains various procedures to perform one-way analysis of variance. However, 
PROC ANOVA of SAS, ONEWAY of SPSS, and BMDP 7D are more suited 
for simple one-way designs including both balanced and unbalanced data sets. 
For the random effects model involving variance component estimation, one 
may prefer to use SAS GLM, SPSS GLM, and BMDP 8V or 3V. The out- 
put from these procedures provides an analysis of variance table, treatment 
means, and their standard errors. BMDP 7D has an extra feature of printing 
comparative histograms and all descriptive measures of location and variability 
for each group and for combined data. This provides an effective visual aid 
in making comparisons between different group means. To obtain similar des- 
criptive measures using SAS, one can use the MEANS statement in GLM or 
UNIVARIATE procedure for each group and for combined data. For an in- 
troduction to SAS, SPSS, and BMDP procedures for performing analysis of 
variance, see Chapter 11. 


2.16 WORKED EXAMPLES USING STATISTICAL PACKAGES 


In this section, we illustrate the applications of statistical packages to perform 
one-way analysis of variance for the data sets employed in examples presented 
in Sections 2.13 and 2.14. Figures 2.2, 2.3, 2.4, and 2.5 illustrate the program 
instructions and the output results for analyzing data in Tables 2.3, 2.5, 2.7, and 
2.9 using SAS ANOVA/GLM, SPSS ONEWAY/GLM, and BMDP 7D/8V/3V 
procedures. The typical output provides the data format listed at the top followed 
by the number of observations for each factor level, estimates of the factor level 
means, and the entries of the analysis of variance table. Note that in each case the 
results are the same as those obtained using manual computations in Sections 
2.13 and 2.14. 
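
For readers without access to these packages, the same analysis of variance table can be produced with a short script. The following Python sketch is not from the original text: the function name is hypothetical and the small data set is made up purely for illustration (it is not one of the data sets of Tables 2.3 through 2.9). It applies the computational formulas used in the manual calculations of Sections 2.13 and 2.14 and handles unequal group sizes.

```python
def one_way_anova(groups):
    """Return the entries of a one-way analysis of variance table.

    `groups` is a list of lists, one list of observations per factor level.
    """
    sizes = [len(g) for g in groups]
    totals = [sum(g) for g in groups]
    N = sum(sizes)
    a = len(groups)

    correction = sum(totals) ** 2 / N                        # y..^2 / N
    raw = sum(y * y for g in groups for y in g)              # sum of y_ij^2
    between_term = sum(t * t / n for t, n in zip(totals, sizes))

    ss_between = between_term - correction
    ss_within = raw - between_term
    ms_between = ss_between / (a - 1)
    ms_within = ss_within / (N - a)
    return {
        "df": (a - 1, N - a),
        "ss": (ss_between, ss_within),
        "ms": (ms_between, ms_within),
        "F": ms_between / ms_within,
    }

# Hypothetical data: three factor levels with three observations each.
table = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(table)
```

The F value returned here would still have to be referred to a table of the F distribution (or a library routine) to obtain a p-value, which is what the packages report automatically.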


2.17 POWER OF THE ANALYSIS OF VARIANCE F TEST 


The power of the F test of the analysis of variance is important in evaluating 
the sensitivity of the test and also in determining sample size needed to attain 
a given value of the power. We recall that the power of a test refers to the 
probability that the decision procedure will reject the null hypothesis when in 




Program instructions:

    DATA WHEATYLD;
    INPUT VARIETY YIELD;
    DATALINES;
    1 96
    1 37
    . .
    ;
    PROC ANOVA;
    CLASSES VARIETY;
    MODEL YIELD=VARIETY;
    RUN;

Output:

    The SAS System
    Analysis of Variance Procedure

    CLASS      LEVELS   VALUES
    VARIETY    4        1 2 3 4

    NUMBER OF OBS. IN DATA SET=24

    Dependent Variable: YIELD
                             Sum of       Mean
    Source             DF    Squares      Square      F Value   Pr > F
    Model               3    2940.0000    980.0000    5.99      0.0044
    Error              20    3272.0000    163.6000
    Corrected Total    23    6212.0000

    R-Square    C.V.        Root MSE    YIELD Mean
    0.473278    16.82977    12.791      76.000

    Source     DF    Anova SS     Mean Square   F Value   Pr > F
    VARIETY     3    2940.0000    980.0000      5.99      0.0044

(i) SAS application: SAS ANOVA instructions and output for the one-way fixed effects analysis
of variance.


Program instructions:

    DATA LIST
      /VARIETY 1
      ...
    ONEWAY YIELD BY
      VARIETY (1,4)
      /STATISTICS=ALL.

Output:

    Test of Homogeneity of Variances
    Levene Statistic    df1    df2    Sig.
    1.402               3      20     ...

    ANOVA
                      Sum of Squares    Mean Square
    Between Groups    2940.000          980.000
    Within Groups     3272.000          163.600
    Total             6212.000

(ii) SPSS application: SPSS ONEWAY instructions and output for the one-way fixed effects
analysis of variance.


Program instructions:

    /INPUT      FILE='C:\SAHAI\TEXTO\EJE1.TXT'.
                FORMAT=FREE.
                VARIABLES=2.
    /VARIABLE   NAMES=VART,YIELD.
    /GROUP      CODES(VART)=1,2,3,4.
                NAMES(VART)=I,II,III,IV.
    /HISTOGRAM  GROUPING=VART.
                VARIABLE=YIELD.
    /END
    1 96
    1 37
    . .
    4 68

Output:

    BMDP7D - ONE- AND TWO-WAY ANALYSIS OF VARIANCE WITH DATA SCREENING
    Release: 7.0 (BMDP/DYNAMIC)

    ANALYSIS OF VARIANCE TABLE FOR MEANS
    SOURCE     SUM OF SQUARES    DF    MEAN SQUARE    F VALUE    PROB.
    VARIETY    2940.0000          3    980.0000       5.99       ...
    ERROR      3272.0000         20    163.6000

    EQUALITY OF MEANS TESTS; VARIANCES ARE NOT ASSUMED TO BE EQUAL
    WELCH             ...    8.94    0.0028
    BROWN-FORSYTHE    ...    ...     0.0113

    LEVENE'S TEST FOR VARIANCES    3, 20    ...

(iii) BMDP application: BMDP 7D instructions and output for the one-way fixed effects analysis
of variance.


FIGURE 2.2 Program Instructions and Output for the One-Way Fixed Effects
Analysis of Variance: Data on Yields of Four Different Varieties of Wheat
(Table 2.3).




Program instructions:

    DATA BLOODANA;
    INPUT DRUG BLOODPH;
    DATALINES;
    1 19
    1 11
    . .
    5 2
    ;
    PROC ANOVA;
    CLASSES DRUG;
    MODEL BLOODPH=DRUG;
    RUN;

Output:

    The SAS System
    Analysis of Variance Procedure

    CLASS    LEVELS   VALUES
    DRUG     5        1 2 3 4 5

    NUMBER OF OBS. IN DATA SET=17

    Dependent Variable: BLOODPH
                             Sum of       Mean
    Source             DF    Squares      Square      F Value   Pr > F
    Model               4    258.00000    64.50000    10.18     0.0008
    Error              12     76.00000     6.33333
    Corrected Total    16    334.00000

    R-Square    C.V.        Root MSE    BLOODPH Mean
    0.772455    35.95159    2.5166      7.0000

    Source    DF    Anova SS     Mean Square   F Value   Pr > F
    DRUG       4    258.00000    64.50000      10.18     0.0008

(i) SAS application: SAS ANOVA instructions and output for the one-way fixed effects analysis
of variance with unequal numbers of observations.


Program instructions:

    DATA LIST
      /DRUG 1
      BLOODPH 3-4.
    BEGIN DATA.
    1 19
    1 11
    . .
    END DATA.
    ONEWAY BLOODPH BY
      DRUG (1,5)
      /STATISTICS=ALL.

Output:

    Test of Homogeneity of Variances
               Levene Statistic    df1    df2    Sig.
    BLOODPH    .448                4      ...    ...

    ANOVA
    BLOODPH           Sum of Squares
    Between Groups    258.000
    Within Groups      76.000
    Total             334.000

(ii) SPSS application: SPSS ONEWAY instructions and output for the one-way fixed effects
analysis of variance with unequal numbers of observations.


Program instructions:

    /INPUT      FILE='C:\SAHAI\TEXTO\EJE2.TXT'.
                FORMAT=FREE.
                VARIABLES=2.
    /VARIABLE   NAMES=DRUG,BLOOPH.
    /GROUP      CODES(DRUG)=1,2,3,4,5.
                NAMES(DRUG)=A,B,C,D,E.
    /HISTOGRAM  GROUPING=DRUG.
                VARIABLE=BLOOPH.
    /END
    1 19
    . .

Output:

    BMDP7D - ONE- AND TWO-WAY ANALYSIS OF VARIANCE WITH DATA SCREENING
    Release: 7.0 (BMDP/DYNAMIC)

    ANALYSIS OF VARIANCE TABLE FOR MEANS
    SOURCE    SUM OF SQUARES    DF     MEAN SQUARE    F VALUE    PROB.
    DRUG      258.0000          ...    64.5000        10.18      0.0008
    ERROR      76.0000          12

    EQUALITY OF MEANS TESTS; VARIANCES ARE NOT ASSUMED TO BE EQUAL
    WELCH             ...
    BROWN-FORSYTHE    ...

(iii) BMDP application: BMDP 7D instructions and output for the one-way fixed effects
analysis of variance with unequal numbers of observations.


FIGURE 2.3 Program Instructions and Output for the One-Way Fixed Effects
Analysis of Variance with Unequal Numbers of Observations: Data on Blood Analysis
of Animals Injected with Five Drugs (Table 2.5).




Program instructions:

    DATA INTERRAT;
    INPUT STAFF RESP;
    DATALINES;
    . .
    ;
    PROC GLM;
    CLASSES STAFF;
    MODEL RESP=STAFF;
    RANDOM STAFF;
    RUN;

Output:

    The SAS System
    General Linear Models Procedure

    CLASS    LEVELS   VALUES
    STAFF    5        1 2 3 4 5

    NUMBER OF OBS. IN DATA SET=20

    Dependent Variable: RESP
                             Sum of       Mean
    Source             DF    Squares      Square      F Value   Pr > F
    Model               4    1683.3000    420.8250    5.75      0.0052
    Error              15    1097.2500     73.1500
    Corrected Total    19    2780.5500

    R-Square    C.V.        Root MSE    RESP Mean
    0.605384    10.51356    8.5528      81.350

    Source    DF    Type I SS      Mean Square   F Value   Pr > F
    STAFF      4    1683.3000      420.8250      5.75      0.0052

    Source    DF    Type III SS    Mean Square   F Value   Pr > F
    STAFF      4    1683.3000      420.8250      5.75      0.0052

    Source    Type III Expected Mean Square
    STAFF     Var(Error) + 4 Var(STAFF)

(i) SAS application: SAS GLM instructions and output for the one-way random effects analysis
of variance.


Program instructions:

    DATA LIST
      /STAFF 1
      RESP 3-4.
    BEGIN DATA.
    1 86
    1 75
    . .
    5 91
    END DATA.
    GLM RESP BY
      STAFF
      /DESIGN STAFF
      /RANDOM STAFF.

Output:

    Tests of Between-Subjects Effects
    Dependent Variable: RESP

    Source                 Type III SS    df    Mean Square    F        Sig.
    STAFF    Hypothesis    1683.300        4    420.825        5.753    .005
             Error         1097.250       15     73.150(a)
    a  MS(Error)

    Expected Mean Squares(a,b)
              Variance Component
    Source    Var(STAFF)    Var(ERROR)
    STAFF     4.000         1.000
    ERROR      .000         1.000
    a  For each source, the expected mean square equals the sum of the
       coefficients in the cells times the variance components, plus
       a quadratic term involving effects in the Quadratic Term cell.
    b  Expected Mean Squares are based on the Type III Sums of Squares.

(ii) SPSS application: SPSS GLM instructions and output for the one-way random effects
analysis of variance.


Program instructions:

    /INPUT      FILE='C:\SAHAI\TEXTO\EJE3.TXT'.
                FORMAT=FREE.
                VARIABLES=4.
    /VARIABLE   NAMES=R1,R2,R3,R4.
    /DESIGN     NAMES=STAFF,RESP.
                LEVELS=5,4.
                RANDOM=STAFF,RESP.
                MODEL='S,R(S)'.
    /END
    86 75 94 86
    . .
    85 84 92 91

Output:

    BMDP8V - GENERAL MIXED MODEL ANALYSIS OF VARIANCE - EQUAL CELL SIZES
    Release: 7.0 (BMDP/DYNAMIC)

    ANALYSIS OF VARIANCE DESIGN
    INDEX               STAFF    RESP
    NUMBER OF LEVELS    5        4
    POPULATION SIZE     INF      INF
    MODEL               S, R(S)

    ANALYSIS OF VARIANCE FOR DEPENDENT VARIABLE 1

    SOURCE    ERROR TERM    SUM OF SQUARES    MEAN SQUARE    F         PROB.
    MEAN      STAFF         132356.450        132356.5       314.52    0.0001
    STAFF     R(S)            1683.300           420.8         5.75    0.0052
    R(S)                      1097.250            73.2

    SOURCE    EXPECTED MEAN SQUARE    ESTIMATES OF VARIANCE COMPONENTS
    MEAN      20(1) + 4(2) + (3)      6596.78125
    STAFF     4(2) + (3)                86.91875
    R(S)      (3)                       73.15000

(iii) BMDP application: BMDP 8V instructions and output for the one-way random effects
analysis of variance.


FIGURE 2.4 Program Instructions and Output for the One-Way Random Effects
Analysis of Variance: Data on Interview Ratings by Five Staff Members
(Table 2.7).




Program instructions:

    DATA CORNVARI;
    INPUT VARIETY YIELD;
    DATALINES;
    . .
    ;
    PROC GLM;
    CLASSES VARIETY;
    MODEL YIELD = VARIETY;
    RANDOM VARIETY;
    RUN;

Output:

    The SAS System
    General Linear Models Procedure

    CLASS      LEVELS   VALUES
    VARIETY    6        1 2 3 4 5 6

    NUMBER OF OBS. IN DATA SET=53

    Dependent Variable: YIELD
                             Sum of        Mean
    Source             DF    Squares       Square       F Value   Pr > F
    Model               5     87.767041    17.553408    5.20      0.0007
    Error              47    158.748431     3.377626
    Corrected Total    52    246.515472

    R-Square    C.V.        Root MSE    YIELD Mean
    0.356031    24.72838    1.8378      7.4321

    Source     DF    Type I SS      Mean Square   F Value   Pr > F
    VARIETY     5    87.767041      17.553408     5.20      0.0007

    Source     DF    Type III SS    Mean Square   F Value   Pr > F
    VARIETY     5    87.767041      17.553408     5.20      0.0007

    Source     Type III Expected Mean Square
    VARIETY    Var(Error) + 8.6264 Var(VARIETY)

(i) SAS application: SAS GLM instructions and output for the one-way random effects analysis
of variance with unequal numbers of observations.


Program instructions:

    DATA LIST
      /VARIETY 1
      YIELD 3-6(1).
    BEGIN DATA.
    . .
    END DATA.
    GLM YIELD BY
      VARIETY
      /DESIGN VARIETY
      /RANDOM VARIETY.

Output:

    Tests of Between-Subjects Effects
    Dependent Variable: YIELD

    Source                   Type III SS    df    Mean Square    F        Sig.
    VARIETY    Hypothesis     87.767         5    17.553         5.197    .001
               Error         158.748        47     3.378(a)
    a  MS(Error)

    Expected Mean Squares(a,b)
               Variance Component
    Source     Var(VARIETY)    Var(ERROR)
    VARIETY    8.626           1.000
    ERROR       .000           1.000
    a  For each source, the expected mean square equals the sum of the
       coefficients in the cells times the variance components, plus a
       quadratic term involving effects in the Quadratic Term cell.
    b  Expected Mean Squares are based on the Type III Sums of Squares.

(ii) SPSS application: SPSS GLM instructions and output for the one-way random effects
analysis of variance with unequal numbers of observations.


Program instructions:

    /INPUT      FILE='C:\SAHAI\TEXTO\EJE4.TXT'.
                FORMAT=FREE.
                VARIABLES=2.
    /VARIABLE   NAMES=VAR,YIELD.
    /GROUP      CODES(VAR)=1,2,3,4,5,6.
                NAMES(VAR)=F,S,I,L,O,C.
    /DESIGN     DEPENDENT=YIELD.
                RANDOM=VAR.
                METHOD=REML.
    /END

Output:

    BMDP3V - GENERAL MIXED MODEL ANALYSIS OF VARIANCE
    Release: 7.0 (BMDP/DYNAMIC)

    DEPENDENT VARIABLE YIELD

    PARAMETER    ESTIMATE    STANDARD    EST/        TWO-TAIL PROB.
                             ERROR       ST.DEV.     (ASYM. THEORY)
    ERR.VAR.     3.377       0.696
    CONSTANT     7.231       0.584       12.376      0.000
    RAND(1)      1.619       1.294

    TESTS OF FIXED EFFECTS BASED ON ASYMPTOTIC VARIANCE-COVARIANCE MATRIX

    SOURCE      F-STATISTIC    DEGREES OF FREEDOM    PROBABILITY
    CONSTANT    153.18         1    52               0.00000

(iii) BMDP application: BMDP 3V instructions and output for the one-way random effects
analysis of variance with unequal numbers of observations.


FIGURE 2.5 Program Instructions and Output for the One-Way Random Effects
Analysis of Variance with Unequal Numbers of Observations: Data on Yields of
Six Varieties of Corn (Table 2.9).




fact the null hypothesis is false. We now illustrate the calculation of the power 
of the F test for Models I and II separately. 


MODEL I (FIXED EFFECTS)


It follows from (2.6.3) and (2.6.6) that the power of the F test for the hypothesis
$H_0: \alpha_i = 0$ $(i = 1, \ldots, a)$ is given by

\[
1 - \beta = P\{F'[a-1, a(n-1); \lambda] > F[a-1, a(n-1); 1-\alpha]\}, \tag{2.17.1}
\]

where $F[a-1, a(n-1); 1-\alpha]$ is the $100(1-\alpha)$th percentage point of the F
distribution with $a-1$ and $a(n-1)$ degrees of freedom and $F'[a-1, a(n-1); \lambda]$
is a statistic having a noncentral F distribution (see Appendix G) with $a-1$
and $a(n-1)$ degrees of freedom and the noncentrality parameter $\lambda$ given by

\[
\lambda = \frac{n}{2\sigma_e^2} \sum_{i=1}^{a} \alpha_i^2. \tag{2.17.2}
\]

When there are unequal numbers of observations ($n_i$) at different factor levels,
the noncentrality parameter takes the form

\[
\lambda = \frac{1}{2\sigma_e^2} \sum_{i=1}^{a} n_i \alpha_i^2. \tag{2.17.3}
\]


Remark: To evaluate the probability expression given by (2.17.1), one needs to employ
noncentral F tables, which involve evaluation of the integrals for the noncentral F
distributions. Such tables of power or the probability expression (2.17.1) have been
calculated by Tang (1938) and by Tiku (1967, 1972). To tabulate the power for each $\alpha$,
it would require a triple-entry table consisting of $\nu_1$, $\nu_2$, and $\lambda$. Tang’s tables give the
probability of type II error, $\beta$, corresponding to the degrees of freedom $\nu_1 = 1\,(1)\,8$,
$\nu_2 = 2, 4, 6, 7\,(1)\,30, 60, \infty$; normalized noncentrality parameter
$\phi = \{2\lambda/(\nu_1 + 1)\}^{1/2} = 1\,(0.5)\,3\,(1)\,8$; and level of significance $\alpha = 0.05$ and 0.01.
The computations of these probabilities are based upon the incomplete noncentral beta
distribution. These tables are reproduced in Graybill (1961, pp. 444-459). Tiku’s tables
give the power corresponding to $\nu_1 = 1\,(1)\,10, 12$; $\nu_2 = 2\,(2)\,30, 40, 60, 120, \infty$;
$\phi = 0.5, 1.0\,(0.2)\,2.2\,(0.4)\,3.0$; and $\alpha = 0.005, 0.01, 0.025, 0.05$, and 0.10 and are also
reproduced in Graybill (1976, pp. 672-686). An abridged version of these tables, for
$\alpha = 0.01, 0.05$, and 0.10, is also reprinted in this volume as Appendix Table VII. Among
other tables, Lehmer (1944) calculated $\phi$ as a function of $(\alpha, 1-\beta, \nu_1, \nu_2)$ for
$\alpha = 0.01, 0.05$; $1-\beta = 0.7, 0.8$. More extensive tables, which also include the Lehmer
tables, were published by the National Bureau of Standards, Washington, DC in 1960, for
$\alpha = 0.01, 0.02, 0.05, 0.10$; $1-\beta = 0.10, 0.50, 0.90, 0.95, 0.99$; except
$(\alpha, \beta) = (0.10, 0.10), (0.20, 0.10)$; $\nu_1 = 1\,(1)\,10, 12, 15, 20, 24, 30, 40, 60, 120, \infty$;
and $\nu_2 = 2\,(2)\,12, 20, 24, 30, 40, 60, \infty$.


In addition to the tables previously described, the computation of the power of 
the F test is facilitated by the availability of power charts, prepared by Pearson 




and Hartley (1951) and Fox (1956), which make calculation of the probabilities
(2.17.1) quite simple. Pearson-Hartley charts, reproduced in Appendix Chart II,
are used as follows:


(i) The charts are given for $\nu_1 = 1, 2, \ldots, 8$, the value of which is shown
in the upper left-hand corner of each chart.

(ii) Two levels of significance, namely, $\alpha = 0.05$ and $\alpha = 0.01$, are given
in the charts.

(iii) There are two X-scales (abscissas) corresponding to the two significance
levels used. The left set of curves for each chart corresponds to $\alpha = 0.05$
and the right set refers to $\alpha = 0.01$.

(iv) Separate curves are provided for different values of $\nu_2$. For each chart,
the values of $\nu_2$ are given at the top of the chart. Because only selected
values of $\nu_2$ are provided in the charts, an interpolation is generally
required for intermediate values of $\nu_2$. There are eleven curves
corresponding to $\nu_2 = 6, 7, 8, 9, 10, 12, 15, 20, 30, 60, \infty$.

(v) The X-scale (abscissa) represents $\phi$, the normalized noncentrality
parameter, and the Y-scale (ordinate) represents the desired power $(1 - \beta)$
as defined in (2.17.1).


Remarks: (i) For the calculation of the power $(1 - \beta)$, we need to know the value of $\phi$
beforehand (whose exact value is unknown). To estimate the value of $\phi$, one requires
knowledge of the $\alpha_i$’s and the error variance $\sigma_e^2$. An estimate of $\sigma_e^2$ can be obtained based on
previous experimentation or a pilot study. If several estimates are available, the largest
one should be taken. The larger the value of $\sigma_e^2$, the larger will be the value of n required
to achieve a given level of power.

(ii) A special condensation of the Pearson and Hartley charts was published by Duncan
(1957), who plotted on a single set of axes the values of $\phi$ corresponding to $1 - \beta = 0.5$
and 0.90 for various values of $\nu_1$ and $\nu_2$. Separate charts were presented for $\alpha = 0.05$
and 0.01. Having $\nu_1$ and $\nu_2$ on the same chart facilitates the computations involving both
of these degrees of freedom.

(iii) The Fox charts, constructed using the Tang and Lehmer tables, are useful in
determining the design parameters (combination of $\nu_1$ and $\nu_2$ values) in order to obtain
a desired power against a specified alternative. The charts consist of the following:


(a) In a $(\nu_1, \nu_2)$-plane and for fixed values of $\alpha$ and $\beta$, the charts give the contours
of $\phi$. These are curves on which $\phi$ has certain constant values.

(b) Because of the choice of a reciprocal scale for $\nu_1$ and $\nu_2$, these curves appear to
be nearly straight lines.

(c) The curves are arranged in eight separate charts for $\alpha = 0.01$ and 0.05, and
$1 - \beta = 0.5, 0.7, 0.8$, and 0.9.

(d) There are two nomograms included among the Fox charts. They are designed
to facilitate the interpolation to values of $1 - \beta$ different from 0.5, 0.7, 0.8, and
0.9.


The Fox charts, reprinted in Scheffé (1959, pp. 446-455), are used as follows. For a
given pair of values $(\beta, \phi)$, the point corresponding to this pair is located in each of the
two grids and the straight line drawn through these two points is then the (approximate)
contour of $\phi$.




(iv) The Pearson and Hartley charts and the Fox charts serve a somewhat
complementary role: the former being designed for $\nu_1 = 1, 2, \ldots, 8$; and the latter for
$\nu_1 = 3, 4, \ldots, \infty$.


Example 1. Consider the pharmaceutical research example described in
Section 2.13 and suppose that the company wishes to know the power of
the F test of the experiment when there are substantial differences between
mean blood pH readings for different drugs. More specifically, suppose
one wishes to consider the case when $\alpha_1 = 8$, $\alpha_2 = -3$, $\alpha_3 = -1$, $\alpha_4 = 0$,
and $\alpha_5 = -3$. From (2.17.3), the value of the noncentrality parameter $\lambda$ is

\[
\lambda = \frac{1}{2\sigma_e^2}\{3(8)^2 + 4(-3)^2 + 3(-1)^2 + 4(0)^2 + 3(-3)^2\} = \frac{1}{2\sigma_e^2}(258.0), \tag{2.17.4}
\]

where an estimate of $\sigma_e^2$ is obtained from $MS_W = 6.333$. Substituting the
value of $\sigma_e^2$ in (2.17.4), we get

\[
\lambda = \frac{258.0}{2(6.333)} = 20.37
\]

and the normalized noncentrality parameter is

\[
\phi = \left\{\frac{2\lambda}{\nu_1 + 1}\right\}^{1/2} = \left\{\frac{2(20.37)}{4 + 1}\right\}^{1/2} = 2.85.
\]

Furthermore, for this example, we have $\nu_1 = 4$ and $\nu_2 = 12$. Hence,
for $\alpha = 0.01$, we find from the Pearson-Hartley charts given in Appendix
Chart II that the power is approximately 0.94. The use of Tiku’s tables given
in Appendix Table VII with appropriate interpolation gives essentially the
same result. Thus, there are about 94 chances in 100 that the F test will
detect the differences in the mean blood pH readings of the five drugs
given the specified differences in their magnitude.
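
The two quantities needed to enter the charts can be computed directly from (2.17.3) and the definition of $\phi$. The Python sketch below is illustrative only: the function names are hypothetical, and the group sizes 3, 4, 3, 4, 3 are read off the coefficients in (2.17.4).

```python
import math

def noncentrality(effects, sizes, error_variance):
    """lambda = sum(n_i * alpha_i^2) / (2 * sigma_e^2), as in (2.17.3)."""
    weighted = sum(n * a * a for a, n in zip(effects, sizes))
    return weighted / (2.0 * error_variance)

def normalized_noncentrality(lam, nu1):
    """phi = sqrt(2 * lambda / (nu1 + 1))."""
    return math.sqrt(2.0 * lam / (nu1 + 1))

effects = [8, -3, -1, 0, -3]   # hypothesized alpha_i from the example
sizes = [3, 4, 3, 4, 3]        # n_i, taken from the coefficients in (2.17.4)
error_variance = 6.333         # MS_W used as the estimate of sigma_e^2

lam = noncentrality(effects, sizes, error_variance)
phi = normalized_noncentrality(lam, nu1=len(effects) - 1)
print(round(lam, 2), round(phi, 2))
```

Note that the hypothesized effects satisfy the weighted constraint $\sum n_i \alpha_i = 0$, which is a useful check when specifying an alternative for unbalanced data.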


In addition to the tables and charts described above, one can use a normal
approximation to the distribution of the square root of the noncentral F variate
to evaluate the power expression (2.17.1). For example, it can be shown
(see, e.g., Johnson et al. (1995, pp. 491-492)) that

\[
\frac{\sqrt{\nu_1 \nu_2^{-1}(2\nu_2 - 1)\,F'[\nu_1, \nu_2; \lambda]} - \sqrt{2(\nu_1 + 2\lambda) - (\nu_1 + 2\lambda)^{-1}(\nu_1 + 4\lambda)}}{\sqrt{\nu_1 \nu_2^{-1}\,F'[\nu_1, \nu_2; \lambda] + (\nu_1 + 2\lambda)^{-1}(\nu_1 + 4\lambda)}} \tag{2.17.5}
\]

is approximately normally distributed with mean zero and standard deviation
one.¹⁴ Applying the normal approximation (2.17.5) to equation (2.17.1), it follows
that (see, e.g., Fleiss (1986, pp. 372-373); Day and Graham (1991))

\[
1 - \beta \doteq P(Z \le z_{1-\beta}),
\]

where

\[
z_{1-\beta} = \frac{\sqrt{\nu_2[2(\nu_1 + 2\lambda)^2 - (\nu_1 + 4\lambda)]} - \sqrt{\nu_1(\nu_1 + 2\lambda)(2\nu_2 - 1)\,F[\nu_1, \nu_2; 1-\alpha]}}{\sqrt{\nu_1(\nu_1 + 2\lambda)\,F[\nu_1, \nu_2; 1-\alpha] + \nu_2(\nu_1 + 4\lambda)}}.
\]


Example 2. For Example 1 considered previously, $\nu_1 = 4$, $\nu_2 = 12$,
$\lambda = 20.37$, $\alpha = 0.01$, $F[4, 12; 0.99] = 5.41$, and $z_{1-\beta}$ is determined by

\[
z_{1-\beta} = \frac{\sqrt{12[2(4 + 2 \times 20.37)^2 - (4 + 4 \times 20.37)]} - \sqrt{4(4 + 2 \times 20.37)(2 \times 12 - 1) \times 5.41}}{\sqrt{4(4 + 2 \times 20.37) \times 5.41 + 12(4 + 4 \times 20.37)}} = 1.51.
\]

Now, from Appendix Table I, the power is given by

\[
1 - \beta \doteq P(Z < 1.51) = 0.93,
\]

which gives nearly the same value as that obtained earlier using the
Pearson-Hartley charts and Tiku’s tables.
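
The normal approximation lends itself to direct computation. The following Python sketch, offered as an illustration rather than as the authors' procedure, evaluates $z_{1-\beta}$ as defined above and converts it to a power using the standard normal distribution function, written here in terms of `math.erf`. The critical value $F[4, 12; 0.99] = 5.41$ is taken from the text rather than computed.

```python
import math

def z_for_power(nu1, nu2, lam, f_crit):
    """z_{1-beta} from the normal approximation to the noncentral F."""
    s = nu1 + 2.0 * lam          # nu1 + 2*lambda
    t = nu1 + 4.0 * lam          # nu1 + 4*lambda
    numerator = (math.sqrt(nu2 * (2.0 * s * s - t))
                 - math.sqrt(nu1 * s * (2.0 * nu2 - 1.0) * f_crit))
    denominator = math.sqrt(nu1 * s * f_crit + nu2 * t)
    return numerator / denominator

def standard_normal_cdf(z):
    """P(Z <= z) for a standard normal variate."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

z = z_for_power(nu1=4, nu2=12, lam=20.37, f_crit=5.41)
power = standard_normal_cdf(z)
print(round(z, 2), round(power, 2))
```

Because only the central F percentage point is needed, this approximation is convenient when noncentral F tables or charts are not at hand.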


MODEL II (RANDOM EFFECTS) 


Under Model II, it follows from (2.7.10) that the power of the F test for the
null hypothesis $H_0: \sigma_\alpha^2 = 0$ is given by

\[
1 - \beta = P\left\{F[a-1, a(n-1)] > \frac{F[a-1, a(n-1); 1-\alpha]}{1 + n\sigma_\alpha^2/\sigma_e^2}\right\}. \tag{2.17.6}
\]

Thus, in the case of Model II, the power of the F test depends only upon the
(central) F distribution and is, therefore, more readily calculable than the power
under Model I, which involves the noncentral F distribution. Furthermore, it
is readily seen that the power of the F test for the more general hypothesis


14 More accurate approximations of the noncentral F variate to evaluate the power expression 
(2.17.1) can be based on the central F distribution. For a discussion of these and some other 
approximations to the noncentral F distribution, see Johnson et al. (1995, pp. 491-495). 




(2.7.11) will be given by

\[
P\left\{F[a-1, a(n-1)] > \frac{1 + n\rho_0}{1 + n\sigma_\alpha^2/\sigma_e^2}\,F[a-1, a(n-1); 1-\alpha]\right\}, \tag{2.17.7}
\]

which again involves only the central F distribution.¹⁵

To simplify the computation of probabilities (2.17.6) and (2.17.7), curves 
giving 1—power have been drawn (see, e.g., Bowker and Lieberman (1972, pp. 
309-313)). These are similar to the Pearson-Hartley charts for the fixed effects 
case described earlier. These curves are reproduced in Appendix Chart III. 


Example 3. Suppose that $a = 4$, $n = 6$, $\sigma_\alpha^2/\sigma_e^2 = 2.5$, and $\alpha$ is taken to
be 0.05. From Appendix Table V, we then find that $F[3, 20; 0.95] = 3.10$,
and using (2.17.6) the power of the test is given by

\[
1 - \beta = P\{F[3, 20] > 3.10/(1 + 6 \times 2.5)\} = P\{F[3, 20] > 0.19\} \doteq 0.90.
\]
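
Since (2.17.6) requires only the central F distribution, the power can also be computed without charts. The Python sketch below is one possible implementation, not the authors': it evaluates the F distribution function through the regularized incomplete beta function, using a standard continued-fraction expansion, and takes the critical value 3.10 from the text. All function names are illustrative.

```python
import math

def _beta_continued_fraction(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    if abs(d) < tiny:
        d = tiny
    d = 1.0 / d
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even-numbered term of the continued fraction.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        if abs(d) < tiny:
            d = tiny
        c = 1.0 + aa / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        h *= d * c
        # Odd-numbered term of the continued fraction.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        if abs(d) < tiny:
            d = tiny
        c = 1.0 + aa / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def regularized_incomplete_beta(a, b, x):
    """I_x(a, b), the regularized incomplete beta function."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log(1.0 - x))
    front = math.exp(log_front)
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_continued_fraction(a, b, x) / a
    return 1.0 - front * _beta_continued_fraction(b, a, 1.0 - x) / b

def f_cdf(x, df1, df2):
    """P(F <= x) for a central F variate with df1 and df2 degrees of freedom."""
    if x <= 0.0:
        return 0.0
    return regularized_incomplete_beta(df1 / 2.0, df2 / 2.0, df1 * x / (df1 * x + df2))

def random_effects_power(a, n, variance_ratio, f_crit):
    """Power from (2.17.6): P{F > f_crit / (1 + n * sigma_alpha^2 / sigma_e^2)}."""
    threshold = f_crit / (1.0 + n * variance_ratio)
    return 1.0 - f_cdf(threshold, a - 1, a * (n - 1))

power = random_effects_power(a=4, n=6, variance_ratio=2.5, f_crit=3.10)
print(round(power, 2))
```

A library routine for the F distribution, where available, could replace the hand-written incomplete beta function; it is written out here only to keep the sketch self-contained.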


2.18 POWER AND DETERMINATION OF SAMPLE SIZE 


We have seen in the preceding section that under Model I, once $\sum_{i=1}^{a} \alpha_i^2$ and $\sigma_e^2$
have been specified, the power of the F test becomes merely a function of n. By
making n suitably large, the power of the F test can be made accordingly large
for any nonzero specified value of $\sum_{i=1}^{a} \alpha_i^2/\sigma_e^2$. That is, for a fixed value of
$\sum_{i=1}^{a} \alpha_i^2/\sigma_e^2$, as n increases, $\phi$ also increases. The larger $\phi$, for a fixed level of
significance $\alpha$, the larger is the power. Therefore, the sample size n can be
determined so as to make the power of the F test sufficiently large, say, 0.80 or 0.90
with respect to the specified values of $\sum_{i=1}^{a} \alpha_i^2$ and $\sigma_e^2$. Equivalently, the sample
size can be determined so as to make the experiment sensitive enough to detect
differences in the parameters $\alpha_i$’s that are considered large enough to be of
practical importance. As mentioned earlier, an estimate of $\sigma_e^2$ can be obtained either
from a pilot survey or from previous experimentation. However, the problem
of specifying $\sum_{i=1}^{a} \alpha_i^2$ is rather a difficult one. Generally, $\sum_{i=1}^{a} \alpha_i^2$ should be
taken as the smallest value that is considered to be of practical importance. An
expert judgment supported by a reasonable rationale is generally required.


15 For discussions of the power function under nonnormality, see Tan and Wong (1980) and Singhal
and Sahai (1994). For discussion of a general approach for making power calculations in the 
most frequently encountered statistical applications, including a fixed effect analysis of variance 
model, without reference to any tables and charts, see Wheeler (1974). 




Example 1. Consider an example with $a = 3$ and the overall F test to be
made at the level of significance $\alpha = 0.05$. From tables of the noncentral
F distribution, the power of the overall test may be determined for various
combinations of $\phi$ and n. Suppose the minimum value of $\alpha_i$ thought to be
of practical importance is 3.0. If $\sigma_e^2$ is estimated to be 25, then

\[
\phi = \left[\frac{n(3^2 + 3^2 + 3^2)}{3 \times 25}\right]^{1/2} = 0.6\sqrt{n}.
\]

Hence, for this special case, using Appendix Table VII, the values of
$\phi$ associated with various values of n and the corresponding power are
determined as follows:

$n$:          3      5      7      9      11     21
$\phi$:       1.0    1.3    1.6    1.8    2.0    2.7
$1 - \beta$:  0.21   0.31   0.62   0.75   0.85   0.99

If the power of 0.80 is considered to be appropriate, then letting n = 10
will provide a test having approximately this power. If the power of 0.90
is desired, then n should be between 11 and 21. It can be seen that when
n = 13, the power is approximately 0.90.
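
The $\phi$ row of the table above depends only on n and can be generated mechanically. The Python sketch below follows the arithmetic of the example (each of the three effects entered with magnitude 3.0 and $\sigma_e^2 = 25$); the function name is hypothetical, and the power values themselves still have to be read from Appendix Table VII.

```python
import math

def phi_for_sample_size(n, effects, error_variance):
    """phi = sqrt(n * sum(alpha_i^2) / (a * sigma_e^2)) for equal group sizes n."""
    a = len(effects)
    return math.sqrt(n * sum(e * e for e in effects) / (a * error_variance))

# Magnitudes used in the example: 3^2 + 3^2 + 3^2 in the numerator.
effects = [3.0, 3.0, 3.0]
phi_by_n = {n: round(phi_for_sample_size(n, effects, 25.0), 1)
            for n in (3, 5, 7, 9, 11, 21)}
print(phi_by_n)
```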


To facilitate the computation of the sample size in the absence of noncentral 
F distribution tables, special tables and charts have been prepared by Feldt and 
Mahmoud (1958a, b), Kastenbaum et al. (1970a), Bowman and Kastenbaum 
(1975), Cohen (1988, Chapter 8), and Day and Graham (1991). Here, we briefly 
describe the use of Feldt-Mahmoud charts. The charts, reproduced in Appendix
Chart IV, give the values of n (Y-scale) as a function of
$\phi' = [\sum_{i=1}^{a} \alpha_i^2/a]^{1/2}/\sigma_e$ (X-scale) for specified values of the number of
factor levels (a), the level of significance ($\alpha$), and the power $(1 - \beta)$. The only
difference between $\phi$ and $\phi'$ is that $\phi'$ does not involve n, because we wish to
determine it. Thus, the relation of $\phi'$ to $\phi$ is simply

\[
\phi' = \phi/\sqrt{n}.
\]

Two levels of significance are used in the charts, namely, $\alpha = 0.05$ and
$\alpha = 0.01$. The charts are given for r = a = 2, 3, 4, 5 and the values of $1 - \beta$
employed in the charts are 0.5, 0.7, 0.8, 0.9, and 0.95. There are two X-scales
depending on which level of significance is employed. Furthermore, the left set
of curves on each chart refers to $\alpha = 0.05$ and the right set to $\alpha = 0.01$. There
are separate curves for different values of $P = 1 - \beta$ and the curves are indexed
according to the value of P at the top of the chart. Since only the selected values
of P are used in the chart, one needs to interpolate for intermediate values of
$1 - \beta$. The sample size n may be read from the ordinate of the curve.




Example 2. Suppose an engineer wishes to determine whether four leading
brand names of light bulbs have the same mean life. The overall F test
is to be made at $\alpha = 0.05$ with a power of $1 - \beta = 0.9$ and the value of $\phi'$
is found to be 0.8. To determine the required sample size, we refer to the
chart for r = 4, locate the curve for P = 0.9 in the left set of curves
corresponding to $\alpha = 0.05$, and read the value of the ordinate at $\phi' = 0.8$ on
the X-scale. We find the value of n to be approximately equal to 7. Hence,
7 bulbs of each brand should be tested to meet the given specifications.


SAMPLE SIZE DETERMINATION USING SMALLEST 
DETECTABLE DIFFERENCE 


The problem of sample size determination considered previously makes use of
the specified values of $\sum_{i=1}^{a} \alpha_i^2$ and $\sigma_e^2$. However, as indicated there, the task
of specifying $\sum_{i=1}^{a} \alpha_i^2$ is a difficult one. Moreover, for fixed effects, $\sum_{i=1}^{a} \alpha_i^2$
is difficult to interpret as a meaningful measure. Instead, one can define the
sensitivity in terms of the magnitude of the difference between any pair of the
$\alpha_i$’s, say, $\Delta$, which is meaningful to detect with a reasonably high probability.
Suppose that for at least one pair of treatments $|\alpha_i - \alpha_j| \ge \Delta$ $(i \ne j)$. Now, the
minimum value of $\lambda$ ($\lambda_{\min}$) in (2.17.2), subject to the condition that at least two
of the $\alpha_i$’s differ by $\Delta$ or more, occurs when the $\alpha_i$’s are such that $|\alpha_j - \alpha_k| = \Delta$
and $\alpha_i = 0$ for all $i \ne j$, $i \ne k$. That is, when only two of the $\alpha_i$’s differ by $\Delta$
and the remaining $a - 2$ $\alpha_i$’s are zero.¹⁶ If the specified power of the test, $1 - \beta$,
is determined for $\lambda_{\min}$, then since power increases with $\lambda$, the power will be at
least as large for all sets of $\alpha_i$’s satisfying the previous condition.
Now, from equation (2.17.2), it follows that

\[
\lambda_{\min} = \frac{n \times 2 \times \dfrac{\Delta^2}{4}}{2\sigma_e^2} = \frac{n\Delta^2}{4\sigma_e^2}.
\]

Again, for a given value of $\Delta$, knowledge of $\sigma_e$ is necessary to determine the
required value of the sample size n. Since $\sigma_e$ is often not known precisely, the
sensitivity parameter $\Delta$ is expressed as $\Delta/\sigma_e$ rather than $\Delta$ itself. On using
equation (2.17.1) for the power of the test with $\lambda = n\Delta^2/4\sigma_e^2$, the smallest
value of n can be determined such that $1 - \beta \ge 1 - \beta_0$, where $\beta$ is the actual
value of the type II error and $\beta_0$ is the specified one. This implies that the actual
power is at least as large as the specified value. Note that when there are only 


16 Let the difference between the largest treatment effect, $\alpha_{\max}$, and the smallest treatment effect,
$\alpha_{\min}$, be denoted by $\Delta = \alpha_{\max} - \alpha_{\min}$. For any set of $\alpha_i$ $(i = 1, 2, \ldots, a)$ satisfying
this condition, the smallest value of $\lambda$ given by (2.17.2) is obtained when the remaining $a - 2$
treatment effects $\alpha_i = (\alpha_{\max} + \alpha_{\min})/2$. Since $\alpha_i$’s satisfy the constraint that $\sum_{i=1}^{a} \alpha_i = 0$, this
implies that $\alpha_{\max} = \Delta/2$, $\alpha_{\min} = -\Delta/2$, and $\alpha_i = 0$, otherwise.




two treatments, the problem is equivalent to that of the two-sample, two-sided 
Student’s t test. Furthermore, it has been shown by Nelson (1983) that the
sample sizes obtained for a fixed effects analysis are generally comparable to 
those obtained using the analysis of means method. 


Remark: Tables of sample sizes using this approach were prepared by Bratcher et al.
(1970) for $\alpha = 0.5, 0.4, 0.3, 0.25, 0.2, 0.1, 0.05, 0.01$; $1 - \beta = 0.7, 0.8, 0.9, 0.95$;
$\Delta/\sigma_e = 1.0\,(0.25)\,2, 2.5, 3$; and $a = 2\,(1)\,11, 13, 16, 21, 25$, and 31. Nelson (1985)
extended these tables for some additional values; that is, $\alpha = 0.1, 0.05, 0.01$;
$1 - \beta = 0.5, 0.8, 0.9, 0.95$; $\Delta/\sigma_e = 0.4\,(0.1)\,1\,(0.2)\,2\,(0.5)\,3.0$; and $a = 2\,(1)\,9$. Some of these
tables are reprinted in Appendix Table IX. A similar but more comprehensive set of tables has
been developed by Bowman (1972) and Bowman and Kastenbaum (1975).


Example 3. Consider a one-way layout involving three treatments
($a = 3$), a significance level of $\alpha = 0.05$, and a type II error of $\beta = 0.2$.
For an effect size of $\Delta/\sigma_e = 1.5$, Appendix Table IX shows that a sample
size of 10 for each treatment will be required.


2.19 INFERENCE ABOUT THE DIFFERENCE BETWEEN 
TREATMENT MEANS: MULTIPLE COMPARISONS 


In an analysis of variance problem involving the comparison of a group of 
treatment means, simply stating that the group means are significantly differ- 
ent may not be sufficient. In addition, the investigator probably also wants to 
know which particular means differ significantly from others, or if there is some
relation among them. For example, in many controlled experiments, the inves- 
tigator plans the experiment in order to estimate and test hypotheses regarding 
a limited number of specific quantities. The analysis of variance F test does 
not directly provide answers to these questions. New test procedures, known as 
multiple comparisons, have been developed to answer questions such as these. 
A full discussion of these procedures is beyond the scope of this volume. We 
present only a brief introduction of some of these procedures. 


Remark: For detailed discussions of the topic, the interested reader is referred to the
survey papers by Kurtz et al. (1965), O’Neill and Wetherill (1971), Chew (1976a, b, c),
Miller (1977, 1985), Krishnaiah (1979), Stoline (1981), and Tukey (1991), and to the
books by Miller (1966, 1981), Rosenthal and Rosnow (1985), Hochberg and Tamhane
(1987), Toothaker (1991), and Hsu (1996). Several standard textbooks on statistics
also contain references to many of these procedures. The survey paper by O’Neill and
Wetherill (1971) also provides a selected and classified bibliography of the subject.


We begin with a definition of a linear combination of means, a contrast and 
orthogonal contrasts. 




LINEAR COMBINATION OF MEANS, CONTRAST 
AND ORTHOGONAL CONTRASTS 


Any expression of the form

$$L = \ell_1\mu_1 + \ell_2\mu_2 + \cdots + \ell_a\mu_a, \tag{2.19.1}$$

where $\ell_i$'s are arbitrary constants is called a linear combination of means.
If one adds the constraint that $\sum_{i=1}^{a} \ell_i = 0$, then the linear combination is
called a contrast of $a$ means $\mu_i$'s, $i = 1, 2, \ldots, a$. The expressions $\mu_1 - \mu_3$ and
$\mu_1 - 2\mu_2 + \mu_3$ are examples of contrasts. Two contrasts $L_1$ and $L_2$ defined by

$$L_1 = \ell_1\mu_1 + \ell_2\mu_2 + \cdots + \ell_a\mu_a$$

and

$$L_2 = \ell'_1\mu_1 + \ell'_2\mu_2 + \cdots + \ell'_a\mu_a$$

are said to be orthogonal if

$$\ell_1\ell'_1 + \ell_2\ell'_2 + \cdots + \ell_a\ell'_a = 0.$$

The expressions $\mu_1 - \mu_2$ and $\mu_1 + \mu_2 - \mu_3 - \mu_4$ are examples of orthogonal
contrasts.
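
The two definitions above can be checked mechanically. The following is a minimal sketch in Python (not part of the original text); the function names `is_contrast` and `are_orthogonal` are hypothetical, and the orthogonality condition coded here is the equal-sample-size condition stated above.

```python
def is_contrast(coeffs, tol=1e-12):
    """A linear combination of means is a contrast when its coefficients sum to zero."""
    return abs(sum(coeffs)) <= tol


def are_orthogonal(c1, c2, tol=1e-12):
    """Two contrasts are orthogonal when the sum of products of their
    corresponding coefficients is zero (equal sample sizes assumed)."""
    if len(c1) != len(c2):
        raise ValueError("coefficient vectors must have the same length")
    return abs(sum(x * y for x, y in zip(c1, c2))) <= tol


# Coefficients for four means (mu_1, mu_2, mu_3, mu_4).
pair = (1, -1, 0, 0)          # mu_1 - mu_2
groups = (1, 1, -1, -1)       # mu_1 + mu_2 - mu_3 - mu_4
curvature = (1, -2, 1, 0)     # mu_1 - 2*mu_2 + mu_3

print(is_contrast(pair), is_contrast(groups), is_contrast(curvature))
print(are_orthogonal(pair, groups))      # orthogonal pair from the text
print(are_orthogonal(pair, curvature))   # both are contrasts, but not orthogonal
```

The last line illustrates that being a contrast and being orthogonal to another contrast are separate properties.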


Given a linear combination or a contrast of factor level means defined in 
(2.19.1), we can estimate it unbiasedly by 


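$$\hat{L} = \ell_1\bar{y}_{1.} + \ell_2\bar{y}_{2.} + \cdots + \ell_a\bar{y}_{a.}. \tag{2.19.2}$$

The sum of squares associated with $L$ is defined by

$$SS_L = \frac{\left(\sum_{i=1}^{a} \ell_i\bar{y}_{i.}\right)^2}{\sum_{i=1}^{a} \left(\ell_i^2/n_i\right)}. \tag{2.19.3}$$

If $n_i = n$, $i = 1, 2, \ldots, a$, then (2.19.3) reduces to

$$SS_L = \frac{n\left(\sum_{i=1}^{a} \ell_i\bar{y}_{i.}\right)^2}{\sum_{i=1}^{a} \ell_i^2}. \tag{2.19.4}$$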


It is easy to show that the expressions (2.19.3) and (2.19.4) are general formulae 
for any sum of squares distributed as a chi-square with one degree of freedom. 




If we assume that each of the sample means is based on the same number 
of observations, then it can be shown that a — 1 orthogonal contrasts can be 
formed using a sample means. These a — 1 contrasts form a set of mutually 
orthogonal contrasts. In addition, it can be shown that the sums of squares of 
the a — 1 orthogonal contrasts will add up to the between-group sum of squares. 
In other words, the between-group sum of squares can be partitioned into a — 1 
sums of squares each having one degree of freedom corresponding to the a — 1 
orthogonal contrasts. Furthermore, mutual orthogonality is desirable because 
it leads to independence of the a — 1 sums of squares associated with the 
orthogonal contrasts. 


Remark: When the levels of a factor are quantitative rather than qualitative in nature, 
the investigator often wants to know whether the means follow a systematic pattern or 
trend; say, linear, quadratic, cubic, and the like. In such a case, the particular contrasts of 
interest involve those measuring linear, quadratic, cubic, or other higher-order trends in a
series of means. As usual, with a means, we have a — 1 orthogonal contrasts, each having 
one degree of freedom. Thus, for three means, we have only two orthogonal contrasts: 
the linear and quadratic. With four means, there are three orthogonal contrasts: the linear, 
quadratic, and cubic, and so on. Many books on statistical design provide coefficients 
for linear, quadratic, cubic, and other contrasts (see, e.g., Fisher and Yates, (1963)). 
The most complete set of coefficients is given by Anderson and Houseman (1942). 
This type of analysis can be used to measure the trend of factor-level means associated 
with equally spaced values of a quantitative variable. It can also be used to assess the 
trend components of means obtained at fixed intervals of times; for example, in ongoing 
surveys of population characteristics and other time series data. 


Example 1. For the data on the yields of four different varieties of wheat 
given in Table 2.3, consider the following set of three mutually orthogonal 
contrasts: 


$$L_1 = \mu_1 - \mu_2,$$
$$L_2 = \mu_3 - \mu_4,$$
$$L_3 = \mu_1 + \mu_2 - \mu_3 - \mu_4.$$

The unbiased estimates of the preceding contrasts are obtained as

$$\hat{L}_1 = \bar{y}_{1.} - \bar{y}_{2.} = 69 - 92 = -23,$$
$$\hat{L}_2 = \bar{y}_{3.} - \bar{y}_{4.} = 63 - 80 = -17,$$

and

$$\hat{L}_3 = \bar{y}_{1.} + \bar{y}_{2.} - \bar{y}_{3.} - \bar{y}_{4.} = 69 + 92 - 63 - 80 = 18.$$

The corresponding sums of squares are calculated as

$$SS_{L_1} = \frac{6(-23)^2}{(1 + 1)} = 1{,}587,$$
$$SS_{L_2} = \frac{6(-17)^2}{(1 + 1)} = 867,$$
$$SS_{L_3} = \frac{6(18)^2}{(1 + 1 + 1 + 1)} = 486.$$

Now, it is readily verified that

$$SS_B = SS_{L_1} + SS_{L_2} + SS_{L_3};$$

that is, 2940 = 1587 + 867 + 486.
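
The arithmetic of this example can be reproduced with a short script. This is an illustrative sketch only, using formula (2.19.3) and the sample means quoted above; the helper names `contrast_estimate` and `contrast_ss` are assumptions introduced here, not notation from the text.

```python
def contrast_estimate(coeffs, means):
    """Estimate of a contrast: sum of l_i * ybar_i."""
    return sum(c * m for c, m in zip(coeffs, means))


def contrast_ss(coeffs, means, sizes):
    """Sum of squares of a contrast, formula (2.19.3):
    (sum l_i ybar_i)^2 / sum(l_i^2 / n_i)."""
    l_hat = contrast_estimate(coeffs, means)
    return l_hat ** 2 / sum(c * c / n for c, n in zip(coeffs, sizes))


means = [69, 92, 63, 80]   # sample means quoted in the example
sizes = [6, 6, 6, 6]       # six observations per variety

contrasts = {
    "L1": (1, -1, 0, 0),
    "L2": (0, 0, 1, -1),
    "L3": (1, 1, -1, -1),
}

ss = {name: contrast_ss(c, means, sizes) for name, c in contrasts.items()}
for name, value in ss.items():
    print(name, contrast_estimate(contrasts[name], means), round(value, 3))

# The three orthogonal contrasts partition the between-group sum of squares.
print("total:", round(sum(ss.values()), 3))
```

Because the three contrasts are mutually orthogonal and the design is balanced, the printed total agrees with the between-group sum of squares of 2940.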


TEST OF HYPOTHESIS INVOLVING A CONTRAST 


Since the $\bar{y}_{i.}$'s are independent, the variance of (2.19.2) is given by

$$\mathrm{Var}(\hat{L}) = \sum_{i=1}^{a} \ell_i^2\,\mathrm{Var}(\bar{y}_{i.}) = \sigma_e^2 \sum_{i=1}^{a} \ell_i^2/n_i.$$

An unbiased estimate of this variance is

$$\widehat{\mathrm{Var}}(\hat{L}) = MS_W \sum_{i=1}^{a} \ell_i^2/n_i. \tag{2.19.5}$$

Inasmuch as $\hat{L}$ is a linear combination of independent normal random variables,
it is also normally distributed. It then follows that the statistic

$$(\hat{L} - L)\Big/\sqrt{\widehat{\mathrm{Var}}(\hat{L})} \tag{2.19.6}$$




has a Student’s t distribution with $N - a$ degrees of freedom. Therefore, a
suitable test statistic for testing the null hypothesis

$$H_0: L = L_0 \tag{2.19.7}$$

versus

$$H_1: L \neq L_0$$

is

$$t[N - a] = \frac{\hat{L} - L_0}{\sqrt{MS_W \sum_{i=1}^{a} \ell_i^2/n_i}} \tag{2.19.8}$$

or, equivalently,

$$F[1, N - a] = \frac{(\hat{L} - L_0)^2}{MS_W \sum_{i=1}^{a} \ell_i^2/n_i}. \tag{2.19.9}$$

A two-sided critical region is used with the t test given by (2.19.8) whereas the
critical region for the F test given by (2.19.9) is determined by the right tail.
Finally, a $100(1 - \alpha)$ percent confidence interval for $L$ is given by

$$P[\hat{L} - w < L < \hat{L} + w] = 1 - \alpha, \tag{2.19.10}$$

where

$$w = t[N - a, 1 - \alpha/2]\sqrt{MS_W \sum_{i=1}^{a} \ell_i^2/n_i} \tag{2.19.11}$$

and $t[N - a, 1 - \alpha/2]$ denotes the $100(1 - \alpha/2)$th percentage point of the t
distribution with $N - a$ degrees of freedom.




Example 2. For the data on blood analysis of animals injected with five 
drugs given in Table 2.5, consider the contrast L defined by 


$$L = \mu_1 + 2\mu_2 - \mu_3 - \mu_4 - \mu_5.$$

An unbiased estimate of this contrast is

$$\hat{L} = \bar{y}_{1.} + 2\bar{y}_{2.} - \bar{y}_{3.} - \bar{y}_{4.} - \bar{y}_{5.} = 15 + 2(4) - 6 - 7 - 4 = 6.$$

A test for the hypothesis

$$H_0: L = 0$$

versus

$$H_1: L \neq 0$$

can be carried out by calculating the value of statistic (2.19.8) with $L_0 = 0$.
In this example, $a = 5$, $n_1 = 3$, $n_2 = 4$, $n_3 = 3$, $n_4 = 4$, $n_5 = 3$, and
$MS_W = 6.333$. Then, on substituting in (2.19.8), we obtain

$$t = \frac{6}{\sqrt{6.333\left(\frac{1}{3} + \frac{4}{4} + \frac{1}{3} + \frac{1}{4} + \frac{1}{3}\right)}} = \frac{6}{\sqrt{14.249}} = 1.59.$$

From Appendix Table III, we find that $t[12, 0.975] = 2.179$ so that $H_0$ is
sustained at $\alpha = 0.05$ level of significance.

Finally, on substituting the appropriate quantities in (2.19.10), a 95
percent confidence interval for $L$ is given by

$$P[6 - 2.179\sqrt{14.249} < L < 6 + 2.179\sqrt{14.249}] = 0.95,$$

or

$$P[-2.23 < L < 14.23] = 0.95.$$

Since the interval includes the value zero, we conclude that $L$ is not
significantly different from zero. Thus, the results of the confidence interval are
in agreement with that of the t test given above.
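
The calculations in this example follow directly from (2.19.8), (2.19.10), and (2.19.11). The sketch below is illustrative rather than definitive: the function name `contrast_t_test` is hypothetical, and the critical value 2.179 is simply the tabled value quoted in the example (the script does not compute percentage points of the t distribution).

```python
import math


def contrast_t_test(coeffs, means, sizes, ms_within, t_crit, l0=0.0):
    """t statistic (2.19.8) and confidence interval (2.19.10) for one contrast.

    t_crit is the tabled value t[N - a, 1 - alpha/2], supplied by the caller.
    """
    l_hat = sum(c * m for c, m in zip(coeffs, means))
    variance = ms_within * sum(c * c / n for c, n in zip(coeffs, sizes))
    std_err = math.sqrt(variance)
    t_stat = (l_hat - l0) / std_err
    half_width = t_crit * std_err
    return {
        "estimate": l_hat,
        "variance": variance,
        "t": t_stat,
        "reject": abs(t_stat) > t_crit,
        "interval": (l_hat - half_width, l_hat + half_width),
    }


result = contrast_t_test(
    coeffs=(1, 2, -1, -1, -1),
    means=(15, 4, 6, 7, 4),      # sample means quoted in the example
    sizes=(3, 4, 3, 4, 3),
    ms_within=6.333,
    t_crit=2.179,                # t[12, 0.975] as quoted from Appendix Table III
)
print(round(result["variance"], 3), round(result["t"], 2), result["reject"])
print(tuple(round(x, 2) for x in result["interval"]))
```

The test and the interval necessarily agree: the hypothesis $L = 0$ is rejected exactly when zero falls outside the interval.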




THE USE OF MULTIPLE COMPARISONS 


After rejecting the null hypothesis (2.7.1), one might be tempted to make a
comparison between each pair of factor level means, that is, 1 versus 2, ..., 1
versus $a$, 2 versus 3, ..., $a - 1$ versus $a$ by using the test procedures (2.19.8)
or (2.19.9). But how many such comparisons need to be made? For $a$ factor
levels there are $\binom{a}{2} = \frac{1}{2}a(a - 1)$ pairs to be compared, although there are only
$a - 1$ degrees of freedom for factor levels. Clearly, not all such comparisons
are independent. Thus, it is not proper to use tests on more than $a - 1$ such
comparisons. Further suppose that an experimenter wishes to compare $a$ factor
levels using $c$ independent (orthogonal) contrasts. If each one of the comparisons
is tested with the same significance level, say, $\alpha$, and if we assume that $MS_W$
has an infinite number of degrees of freedom (so the tests are independent), then
when all the null hypotheses involving $c$ comparisons are true, the probability of
falsely rejecting at least one of them is equal to $1 - (1 - \alpha)^c$. For $\alpha = 0.05$ and
$c = 5$, this probability is $1 - (0.95)^5 = 0.2262$, and for $c = 10$, the probability
increases to $1 - (0.95)^{10} = 0.4013$. Thus, if we do not reject the null hypothesis
with the initial F test, and if we then perform tests based on contrasts, we
increase the overall probability of committing a type I error.¹⁷ Moreover, it is
difficult to obtain an expression equivalent to $1 - (1 - \alpha)^c$ for comparisons made
with nonindependent (nonorthogonal) contrasts. The difficulties just described
in connection with the test procedures (2.19.8) or (2.19.9), or the confidence
interval (2.19.10), are resolved by using a multiple comparison method. Several
multiple comparison procedures are available in the literature. In the following,
we consider two such widely employed procedures known as Tukey’s method
and Scheffé’s method, respectively.
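
The growth of the overall error rate with the number of independent comparisons is easy to tabulate. The following sketch (the function name `familywise_error_rate` is an assumption made for illustration) evaluates $1 - (1 - \alpha)^c$ under the independence assumption stated above.

```python
def familywise_error_rate(alpha, c):
    """Probability of at least one false rejection among c independent tests,
    each carried out at level alpha, when all c null hypotheses are true."""
    return 1.0 - (1.0 - alpha) ** c


for c in (1, 2, 5, 10, 20):
    print(c, round(familywise_error_rate(0.05, c), 4))
```

For nonorthogonal contrasts the tests are dependent and this expression no longer applies, which is one motivation for the simultaneous procedures that follow.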


Tukey’s method 
According to this method, if $L$ is any contrast estimated by $\hat{L}$, then a $100(1 - \alpha)$
percent simultaneous confidence interval for $L$ is given by

$$\hat{L} - T\sqrt{n^{-1}MS_W}\left(\frac{1}{2}\sum_{i=1}^{a}|\ell_i|\right) < L < \hat{L} + T\sqrt{n^{-1}MS_W}\left(\frac{1}{2}\sum_{i=1}^{a}|\ell_i|\right), \tag{2.19.12}$$

where $T = q[a, a(n - 1); 1 - \alpha]$ is the $100(1 - \alpha)$th percentage point of the
Studentized range distribution with parameters $a$ and $a(n - 1)$. (For a
definition of the Studentized range distribution, see Appendix I.) Some selected


17 Since we are considering individual comparisons as well as sets of such comparisons, there are 
two different types of error rates or significance levels at issue. When considering individual 
comparisons, the significance level is referred to as comparisonwise error rate. When considering 
an entire set of comparisons, the significance level associated with all the comparisons in the 
set is called the experimentwise error rate. 




percentage points of the Studentized range distribution are given in Appendix
Table X. Tukey (1953) has shown that for a given value of $\alpha$, the intervals
given by (2.19.12) hold simultaneously for every possible contrast that may be
constructed (see also Scheffé (1959, p. 74)). If the interval contains the value
zero, the contrast is said to be not significantly different from zero; whereas if
the interval does not contain the value zero, the contrast is said to be
significantly different from zero. Thus, to test the null hypothesis (2.19.7), we note
whether

$$|\hat{L}| \Big/ \left[\sqrt{n^{-1}MS_W}\left(\frac{1}{2}\sum_{i=1}^{a}|\ell_i|\right)\right] > q[a, a(n - 1); 1 - \alpha]. \tag{2.19.13}$$

Tukey’s method was originally designed for contrasts comparing two means;
that is, $L = \mu_1 - \mu_2$, and so on. It is seldom used in practice except for this
special contrast. For this situation,

$$\frac{1}{2}\sum_{i=1}^{a}|\ell_i| = \frac{1}{2}(|1| + |-1|) = 1,$$

and the intervals given by (2.19.12) reduce to

$$\bar{y}_{i.} - \bar{y}_{i'.} - T\sqrt{n^{-1}MS_W} < \mu_i - \mu_{i'} < \bar{y}_{i.} - \bar{y}_{i'.} + T\sqrt{n^{-1}MS_W}. \tag{2.19.14}$$

Thus, according to the Tukey’s method, the probability is $1 - \alpha$ that the intervals
(2.19.14) contain all $a(a - 1)/2$ pairwise differences of the type $\mu_i - \mu_{i'}$, $i \neq i'$.


Example 3. To illustrate this procedure, consider the data on the yield
of four different varieties of wheat given in Table 2.3. Here, $a = 4$, $n = 6$,
$N - a = 20$, and $MS_W = 163.600$. If we let $\alpha = 0.05$, then from
the values of the Studentized range distribution given in Appendix Table
X, $q[4, 20; 0.95] = 3.58$. Now, the pairwise differences of sample means
will be compared to

$$q[4, 20; 0.95]\sqrt{n^{-1}MS_W} = 3.58\sqrt{163.600/6} = 18.69.$$

Note that there are 4(3)/2 = 6 pairs of differences to be compared. The
four sample means are

$$\bar{y}_{1.} = 69, \quad \bar{y}_{2.} = 92, \quad \bar{y}_{3.} = 63, \quad \text{and} \quad \bar{y}_{4.} = 80;$$

and the 6 pairs of differences can be arranged systematically as in Table
2.11.




TABLE 2.11
Pairwise Differences of Sample Means, $\bar{y}_{i.} - \bar{y}_{i'.}$

                      ȳ(III) = 63    ȳ(I) = 69    ȳ(IV) = 80
    ȳ(II) = 92            29*           23*           12
    ȳ(IV) = 80            17            11
    ȳ(I)  = 69             6

    * Difference exceeds 18.69.

The differences marked with an asterisk exceed 18.69. The conclusions are
that the variety II is significantly better than I and III. There is no significant
difference between the varieties II and IV. The probability that we have
made one or more incorrect statements is 0.05.
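
The pairwise comparisons of this example can be sketched in a few lines of Python. This is an illustration only: `tukey_pairwise` is a hypothetical helper, the design is assumed balanced, and the critical value 3.58 is taken as quoted in the example rather than computed from the Studentized range distribution.

```python
import math
from itertools import combinations


def tukey_pairwise(means, n, ms_within, q_crit):
    """Compare every pair of sample means with the Tukey margin
    q_crit * sqrt(MS_W / n), as in (2.19.14). Balanced design assumed."""
    margin = q_crit * math.sqrt(ms_within / n)
    results = []
    for (label_i, mean_i), (label_j, mean_j) in combinations(means.items(), 2):
        diff = mean_i - mean_j
        results.append((label_i, label_j, diff, abs(diff) > margin))
    return margin, results


means = {"I": 69, "II": 92, "III": 63, "IV": 80}   # sample means from the example
margin, results = tukey_pairwise(means, n=6, ms_within=163.600, q_crit=3.58)

print("margin:", round(margin, 2))
for label_i, label_j, diff, significant in results:
    print(label_i, label_j, diff, "significant" if significant else "not significant")

significant_pairs = [(i, j) for i, j, _, flag in results if flag]
```

Only the pairs involving variety II with varieties I and III are flagged, in line with the conclusions stated in the example.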


Example 4. To illustrate Tukey’s method for a more complex contrast,
say, $L = \mu_1 + \mu_4 - \mu_2 - \mu_3$, we use the same data as in Example 3. Here,

$$\frac{1}{2}\sum_{i=1}^{a}|\ell_i| = \frac{1}{2}(1 + 1 + 1 + 1) = 2, \qquad T = q[4, 20; 0.95] = 3.58,$$

$$T\sqrt{n^{-1}MS_W}\left(\frac{1}{2}\sum_{i=1}^{a}|\ell_i|\right) = 3.58\sqrt{(163.600/6)}\,(2) = 37.39,$$

$$\hat{L} = 69 + 80 - 92 - 63 = -6.$$

Now, substituting the appropriate quantities in (2.19.12), we get a 95
percent simultaneous confidence interval for $L$ as

$$-6 - 37.39 < L < -6 + 37.39$$

or

$$-43.39 < L < 31.39.$$


Since the interval includes the value zero, we conclude that L is not signi- 
ficantly different from zero. 




Scheffé’s method 
According to this method, if $L$ is any contrast estimated by $\hat{L}$, then a $100(1 - \alpha)$
percent simultaneous confidence interval for $L$ is given by

$$\hat{L} - S\sqrt{(a - 1)MS_W\sum_{i=1}^{a}\ell_i^2/n_i} < L < \hat{L} + S\sqrt{(a - 1)MS_W\sum_{i=1}^{a}\ell_i^2/n_i}, \tag{2.19.15}$$

where $S^2 = F[a - 1, N - a; 1 - \alpha]$ is the $100(1 - \alpha)$th percentage point
of the F distribution with $a - 1$ and $N - a$ degrees of freedom. For a given
value of $\alpha$, Scheffé (1953) has shown that the intervals given by (2.19.15)
hold simultaneously for every possible contrast that can be constructed (see
also Scheffé (1959, p. 69)).¹⁸ Again, as in the Tukey’s method, to test the null
hypothesis (2.19.7), we note whether¹⁹

$$|\hat{L}| \Big/ \sqrt{(a - 1)MS_W\sum_{i=1}^{a}\ell_i^2/n_i} > \{F[a - 1, N - a; 1 - \alpha]\}^{1/2}. \tag{2.19.16}$$


Example 5. To illustrate this procedure, we again use the data of Table
2.3. As before, $a = 4$, $n = 6$, $N - a = 20$, and $MS_W = 163.600$.
If we again let $\alpha = 0.05$, then from Appendix Table V, we find that
$S^2 = F[3, 20; 0.95] = 3.10$. Furthermore, for the contrasts consisting of
the differences of two means, we have

$$\sum_{i=1}^{a}\ell_i^2/n_i = (1 + 1)/6 = \frac{1}{3}.$$

Now, the differences between the sample means will be compared to

$$\left\{(4 - 1)(163.600)\left(\frac{1}{3}\right)(3.10)\right\}^{1/2} = 22.52,$$

instead of 18.69 as in Tukey’s method. Hence, in Table 2.11, again the
sample differences $\bar{y}_{2.} - \bar{y}_{3.}$ and $\bar{y}_{2.} - \bar{y}_{1.}$ are significant by the Scheffé’s


18 For a simple proof of the result (2.19.15) using elementary calculus, see Klotz (1969).
19 For an extension of Scheffé’s method that tests all possible sets of comparisons encompassing
all pairs of means, all possible contrasts between groups of means, and all possible partitions of
the means, see Gabriel (1964).




method. It should, however, be observed that the critical value for mean 
differences for Scheffé’s method is larger than for Tukey’s method. In 
general, for simple contrasts of this type, Tukey’s method gives shorter 
intervals and consequently finds more differences significant. 


Example 6. To illustrate the computation of a more complex contrast,
again, as in Tukey’s method, consider $L = \mu_1 + \mu_4 - \mu_2 - \mu_3$. Then

$$\hat{L} = 69 + 80 - 92 - 63 = -6,$$

$$\sum_{i=1}^{a}\ell_i^2/n_i = (1 + 1 + 1 + 1)/6 = \frac{2}{3},$$

$$S\sqrt{(a - 1)MS_W\sum_{i=1}^{a}\ell_i^2/n_i} = \sqrt{(3.10)}\left\{(4 - 1)(163.600)\left(\frac{2}{3}\right)\right\}^{1/2} = 31.85.$$

On substituting the appropriate quantities in (2.19.15), we get a 95 percent
simultaneous confidence interval for $L$ as

$$-6 - 31.85 < L < -6 + 31.85$$

or

$$-37.85 < L < 25.85.$$


So in this case the interval given by Scheffé’s method is shorter than 
that provided by Tukey’s method. In general, for more complex contrasts, 
Scheffé’s method gives shorter intervals than Tukey’s method. 


Example 7. To illustrate further, suppose that the experimenter had
decided before conducting the experiment that the only contrast of
interest is $L = \mu_1 + \mu_4 - \mu_2 - \mu_3$. Then, we can get an interval for $L$
using (2.19.10), which is based upon the usual t distribution. Now, from
Appendix Table III, $t[20, 0.975] = 2.086$ and, on substituting the
appropriate quantities in (2.19.10), the 95 percent confidence interval for $L$ is
obtained as

$$-6 - 2.086\left\{(163.600)\left(\frac{2}{3}\right)\right\}^{1/2} < L < -6 + 2.086\left\{(163.600)\left(\frac{2}{3}\right)\right\}^{1/2}$$

or

$$-27.78 < L < 15.78.$$

Thus, the interval constructed from (2.19.10) is much shorter than the one
given by either Tukey’s or Scheffé’s method.
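
The half-widths obtained in Examples 4, 6, and 7 for the same contrast can be placed side by side. The sketch below is illustrative; the three function names are hypothetical, and the critical values (3.58, 3.10, 2.086) are those quoted in the examples rather than values computed by the script.

```python
import math


def tukey_half_width(coeffs, n, ms_within, q_crit):
    """Half-width of the Tukey interval (2.19.12); balanced design assumed."""
    return q_crit * math.sqrt(ms_within / n) * 0.5 * sum(abs(c) for c in coeffs)


def scheffe_half_width(coeffs, sizes, ms_within, f_crit):
    """Half-width of the Scheffe interval (2.19.15)."""
    a = len(coeffs)
    weight = sum(c * c / n for c, n in zip(coeffs, sizes))
    return math.sqrt(f_crit * (a - 1) * ms_within * weight)


def t_half_width(coeffs, sizes, ms_within, t_crit):
    """Half-width of the single-contrast t interval (2.19.10)-(2.19.11)."""
    weight = sum(c * c / n for c, n in zip(coeffs, sizes))
    return t_crit * math.sqrt(ms_within * weight)


coeffs = (1, -1, -1, 1)          # mu_1 + mu_4 - mu_2 - mu_3
sizes = (6, 6, 6, 6)
ms_within = 163.600

tukey = tukey_half_width(coeffs, 6, ms_within, q_crit=3.58)
scheffe = scheffe_half_width(coeffs, sizes, ms_within, f_crit=3.10)
single = t_half_width(coeffs, sizes, ms_within, t_crit=2.086)

print(round(tukey, 2), round(scheffe, 2), round(single, 2))
```

The ordering of the three half-widths reflects the price paid for simultaneous coverage: the single prechosen contrast gets the shortest interval, and for this four-mean contrast Scheffé’s interval is shorter than Tukey’s.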


Naturally, we would expect to get a shorter interval from a procedure de- 
signed to capture one prechosen contrast than from procedures that try to catch 
all possible contrasts. However, it is quite unlikely that the experimenter would 
specify a contrast in advance. Usually, we first test the hypothesis of equal fac- 
tor level means. If this is rejected, we attempt to discover the contrasts that are 
significantly different from zero. As has been discussed earlier in this section, it 
is usually difficult to calculate the significance level associated with the several 
intervals of the type (2.19.10). Thus, we would use the intervals (2.19.12) and 
(2.19.15), given by Tukey’s method and Scheffé’s method, respectively, which 
may be computed for contrasts that are selected after the experiment is con- 
ducted and for which we can make an exact probability statement. 


Example 8. Finally, we illustrate Scheffé’s method for the case of model 
(2.1.1) with unequal sample sizes by using the data on blood analysis of 
animals injected with five drugs given in Table 2.5. Let the contrast of 
interest be defined by 


$$L = \mu_1 + 2\mu_2 - \mu_3 - \mu_4 - \mu_5,$$

which is estimated by

$$\hat{L} = 15 + 2(4) - 6 - 7 - 4 = 6.$$

Furthermore, in this case, we have $a = 5$, $n_1 = 3$, $n_2 = 4$, $n_3 = 3$, $n_4 = 4$,
$n_5 = 3$, and $MS_W = 6.333$. For $\alpha = 0.05$, we find from Appendix Table V
that $S^2 = F[4, 12; 0.95] = 3.26$. Now, for the given contrast, we get

$$\sum_{i=1}^{a}\ell_i^2/n_i = \frac{1}{3} + \frac{4}{4} + \frac{1}{3} + \frac{1}{4} + \frac{1}{3} = \frac{9}{4},$$

$$S\sqrt{(a - 1)MS_W\sum_{i=1}^{a}\ell_i^2/n_i} = \sqrt{(3.26)}\left\{(5 - 1)(6.333)\left(\frac{9}{4}\right)\right\}^{1/2} = 13.63.$$

On substituting the appropriate quantities in (2.19.15), the 95 percent
simultaneous confidence interval for $L$ is obtained as

$$6.00 - 13.63 < L < 6.00 + 13.63$$

or

$$-7.63 < L < 19.63.$$

Note from Example 2 that this interval is larger than the interval based on
the usual t distribution. Again, the interval includes zero, and hence the
contrast is not significantly different from zero.
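
For unequal sample sizes, the Scheffé interval (2.19.15) and the test (2.19.16) can be sketched as follows. The function name `scheffe_contrast` is an assumption made for illustration, and the value 3.26 is the tabled $F[4, 12; 0.95]$ quoted in the example, not something the script derives.

```python
import math


def scheffe_contrast(coeffs, means, sizes, ms_within, f_crit):
    """Scheffe simultaneous interval (2.19.15) and test (2.19.16) for a contrast.

    Works for unequal sample sizes; f_crit is F[a - 1, N - a; 1 - alpha].
    """
    a = len(coeffs)
    l_hat = sum(c * m for c, m in zip(coeffs, means))
    scale = math.sqrt((a - 1) * ms_within * sum(c * c / n for c, n in zip(coeffs, sizes)))
    half_width = math.sqrt(f_crit) * scale
    return {
        "estimate": l_hat,
        "half_width": half_width,
        "interval": (l_hat - half_width, l_hat + half_width),
        "significant": abs(l_hat) / scale > math.sqrt(f_crit),
    }


out = scheffe_contrast(
    coeffs=(1, 2, -1, -1, -1),
    means=(15, 4, 6, 7, 4),      # sample means quoted in the example
    sizes=(3, 4, 3, 4, 3),
    ms_within=6.333,
    f_crit=3.26,
)
print(round(out["half_width"], 2), tuple(round(x, 2) for x in out["interval"]))
print(out["significant"])
```

The half-width of about 13.63 exceeds the 8.23 obtained from the ordinary t interval in Example 2 ($2.179\sqrt{14.249}$), which is the comparison drawn at the end of the example.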


Interpretation of Tukey’s and Scheffé’s methods 

Suppose one were to calculate the confidence intervals for all conceivable
contrasts by using (2.19.12) or (2.19.15). Then, according to Tukey’s and Scheffé’s
methods, the entire set of confidence intervals would be correct in $100(1 - \alpha)$
percent of repetitions of the experiment. Note that this is a somewhat different
interpretation than one from the ordinary confidence interval. When we make a
95 percent confidence interval for, say, a single parameter $\theta$, we are correct in
saying that if we took all possible random samples of size $n$ and calculated the
interval for each, the interval would cover the true value of $\theta$ in 95 percent of the
cases. For the multiple comparisons, however, we are referring to all possible 
comparisons that might be made on a given set of data, and the probability 
statement is about the event of all such intervals computed from a set of data 
covering the corresponding true values. It is, therefore, important that the initial 
F test be significant giving us prior reason to believe that reliable departures 
from the hypothesis exist. These differences are to be found among the possible 
comparisons. 




Remark: Researchers who have used Tukey’s and Scheffé’s methods have occasionally 
been somewhat surprised to find that a significance of the overall analysis of variance 
F test has not led to at least one significant contrast. We expect that if the overall 
test is significant at the $\alpha$-level, then at least the maximum possible contrast will also
be significant at the $\alpha$-level. Unfortunately, the maximum possible contrast may have
been of little interest, and, therefore, may not have been computed. There is no guar- 
antee that the obvious contrasts (i.e., the differences within pairs of means) or the con- 
trasts most interesting to the experimenter will be significant when the overall F test is 
significant. 


Comparison of Tukey’s and Scheffé’s methods
In the following, we provide relative merits and drawbacks of Tukey’s and 
Scheffé’s methods of multiple comparisons. 


1. The Tukey’s method can be used only with equal sample sizes for all 
factor levels, but the Scheffé’s method is applicable whether the sample 
sizes are equal or not. 

2. Although the Tukey’s method is applicable for any general contrast, 
the procedure is most powerful when comparing simple pairwise dif- 
ferences and not when making more complex comparisons. 

3. If only pairwise comparisons are of interest, and all factor levels have 
equal sample sizes, Tukey’s method gives shorter confidence intervals 
and thus is more powerful. 

4. In the case of comparisons involving general contrasts, Scheffé’s method
tends to give narrower confidence limits, and thus provides a more
powerful significance test.

5. The Scheffé’s method has the property that if the F test produces
significant results, then the corresponding Scheffé’s multiple comparison
will detect at least one statistically significant contrast from all possible
contrasts. Thus, we are able to draw more conclusions than merely that
the factor level means are not all equal.

6. The Scheffé’s method requires the use of the tables of the F distribution 
which are more readily available than the tables of the Studentized range 
distribution used by the Tukey’s method. 

7. The Scheffé’s method is less sensitive to violations of normality and 
homogeneity of variance assumptions than is the Tukey’s method. 


OTHER MULTIPLE COMPARISON METHODS 


In addition to Tukey’s and Scheffé’s methods described previously, there are a 
number of other multiple comparison procedures that are used widely in many 
substantive fields of research. In the following, we briefly outline some other 
common procedures for making a post hoc comparison. 


Least significant difference test 
The test, also commonly referred to as the protected least significant difference
(LSD) test, was originally proposed by Fisher (1935). The test is carried out in steps




as follows: 


1. First, an overall F test at a given level of significance ($\alpha$) is carried
out to determine whether there are significant differences among the
treatment groups.

2. Only if the F test in step 1 is significant, pairwise comparisons among
treatments are performed using a t test at level $\alpha$.


Assuming a two-sided alternative, the pair of means $\mu_i$ and $\mu_{i'}$ would be declared
significantly different if

$$|\bar{y}_{i.} - \bar{y}_{i'.}| > t[N - a, 1 - \alpha/2]\sqrt{MS_W\left(\frac{1}{n_i} + \frac{1}{n_{i'}}\right)}. \tag{2.19.17}$$

The quantity to the right of the inequality in (2.19.17) is called the least
significant difference. If the design is balanced, that is, $n_1 = n_2 = \cdots = n_a = n$,
then it reduces to $t[a(n - 1), 1 - \alpha/2]\sqrt{2MS_W/n}$. To use the LSD procedure,
one need simply compare the observed differences between each pair of sample 
means to the corresponding least significant difference. If the sample difference 
exceeds this quantity, one may conclude that the pair of means are significantly 
different. 
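
The two steps of the protected LSD procedure can be sketched as follows. This worked illustration is not in the original text: it reuses the wheat-variety quantities quoted in the earlier examples ($SS_B = 2940$, $MS_W = 163.600$, $F[3, 20; 0.95] = 3.10$, $t[20, 0.975] = 2.086$), and the function name `protected_lsd` is hypothetical.

```python
import math
from itertools import combinations


def protected_lsd(means, n, ms_between, ms_within, f_crit, t_crit):
    """Fisher's protected LSD for a balanced design.

    Step 1: overall F test. Step 2: only if F is significant, compare each
    pairwise difference with the least significant difference
    t_crit * sqrt(2 * MS_W / n). Critical values are supplied by the caller.
    """
    f_stat = ms_between / ms_within
    if f_stat <= f_crit:
        return f_stat, None, []          # stop: no pairwise comparisons are made
    lsd = t_crit * math.sqrt(2.0 * ms_within / n)
    flagged = [
        (i, j) for (i, yi), (j, yj) in combinations(means.items(), 2)
        if abs(yi - yj) > lsd
    ]
    return f_stat, lsd, flagged


means = {"I": 69, "II": 92, "III": 63, "IV": 80}
f_stat, lsd, flagged = protected_lsd(
    means, n=6,
    ms_between=2940 / 3,     # SS_B / (a - 1), using SS_B = 2940 from Example 1
    ms_within=163.600,
    f_crit=3.10,             # F[3, 20; 0.95] as quoted in Example 5
    t_crit=2.086,            # t[20, 0.975] as quoted in Example 7
)
print(round(f_stat, 2), round(lsd, 2), flagged)
```

Under these assumptions the least significant difference is smaller than the Tukey margin of 18.69, so the LSD procedure flags one more pair (III versus IV) than Tukey's method does, which illustrates the weaker protection noted in the remark that follows.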


Remark: The use of the preliminary F test in the LSD procedure helps to protect the 
overall error rate under the null hypothesis of no differences in the set of treatment 
effects. However, even if a single comparison differs significantly from zero, the LSD 
approach does not protect against finding chance differences among other comparisons. 


Bonferroni’s test 

The test is based on the principle that if there are $k$ null hypotheses to be tested,
then a desired overall error rate of at most $\alpha$ can be achieved by testing each
null hypothesis at level $\alpha/k$. Equivalently, if there are $k$ confidence intervals
each constructed at confidence level $100(1 - \alpha/k)$ percent, then they all hold
simultaneously with confidence level of at least $100(1 - \alpha)$ percent. To see
how this works, suppose that each null hypothesis is tested at level $\alpha^*$ and let
$E_i$ denote the event that the $i$-th null hypothesis is rejected. Then, the overall
probability of a type I error ($\alpha$) is given by

$$\alpha = P\{E_1 \cup E_2 \cup \cdots \cup E_k\} \leq P(E_1) + P(E_2) + \cdots + P(E_k) = k\alpha^*.$$

Thus, if each one of the $k$ null hypotheses is tested at level $\alpha/k$, the overall
error rate is at most $\alpha$. The procedure is known as the Bonferroni’s method,
since it is based on the Bonferroni or Boole inequality. Sometimes it is also




referred to as Dunn’s multiple comparison procedure following Dunn (1961) 
who examined the properties of the procedure in detail and prepared tables to 
facilitate its use. The method is fairly simple and versatile and gives reasonably 
good results if $k$ is not very large. The procedure tends to be somewhat
conservative, that is, true confidence levels tend to be greater than $1 - \alpha$. The method
should be used when there are only few comparisons to be made and none of
the other procedures are appropriate.²⁰ Special percentage points of the t
distribution for very small values of $\alpha$ are usually required to determine Bonferroni
intervals. Specially designed tables for this purpose are given in Dunn (1961),
Pearson and Hartley (1970, Table 9), Bailey (1977), Miller (1981, p. 238), and
Kafadar and Tukey (1988). Moses (1978) provides charts for finding upper
percentage points for $\alpha$ in the range of 0.01 to 0.00001. Koehler (1983) gives a fairly
simple and accurate approximation for the extreme percentiles of the t
distribution. Many statistical packages and other computer programs have standard
routines for calculating percentage points. Some selected percentage points of
the t distribution to determine Bonferroni intervals are given in Appendix Table
XIII. For some further discussions and details about the Bonferroni statistic, 
see Dunn (1959, 1961) and Dunn and Massey (1965). 


Remark: Holm (1979) introduced a modified Bonferroni test that consists of a class of 
sequentially rejective Bonferroni (SRB) procedures which results in greater power than 
the Bonferroni’s test. Under Holm’s SRB criterion, if any hypothesis is rejected at the
level $\alpha^* = \alpha/k$, then the denominator of $\alpha^*$ for the next test is $k - 1$, and the criterion
continues to be modified in a stepwise manner, with the denominator of $\alpha^*$ decreased
by 1 each time a hypothesis is rejected, so that tests can be conducted at successively
higher significance levels. The experimentwise error rate of the SRB procedures is $\leq \alpha$
as is that of the standard Bonferroni procedure. Shaffer (1986) introduced a refinement
of Holm’s SRB test known as the modified sequentially rejective Bonferroni (MSRB)
test which is at least as powerful as the Holm’s test while maintaining an experimentwise
error rate $\leq \alpha$.
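
The stepwise criterion described in this remark can be sketched in Python. The sketch assumes the usual reading of Holm's procedure, in which the hypotheses are taken in increasing order of their p-values and testing stops at the first nonrejection; the function names and the p-values are hypothetical and serve only as an illustration.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Ordinary Bonferroni: every hypothesis is tested at level alpha / k."""
    k = len(p_values)
    return [p <= alpha / k for p in p_values]


def holm_reject(p_values, alpha=0.05):
    """Holm's sequentially rejective Bonferroni procedure.

    Hypotheses are visited from the smallest p-value upward; the first is
    tested at alpha / k, the next at alpha / (k - 1), and so on. Testing
    stops at the first hypothesis that is not rejected.
    """
    k = len(p_values)
    order = sorted(range(k), key=lambda i: p_values[i])
    reject = [False] * k
    for step, idx in enumerate(order):
        if p_values[idx] <= alpha / (k - step):
            reject[idx] = True
        else:
            break
    return reject


p_values = [0.001, 0.04, 0.03, 0.015]   # hypothetical p-values for k = 4 tests
print(bonferroni_reject(p_values))
print(holm_reject(p_values))
```

With these illustrative p-values, the hypothesis with p = 0.015 is rejected by Holm's procedure (tested at 0.05/3) but not by the ordinary Bonferroni test (tested at 0.05/4), which shows where the gain in power comes from.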


Dunn-Sidak’s test 

According to Bonferroni’s or Dunn’s test, given $k$ comparisons or contrasts
each to be tested at the level $\alpha^*$, the overall error rate ($\alpha$) cannot exceed $k\alpha^*$. For
small values of $\alpha$, it provides an excellent approximation to the upper bound.
However, an even better approximation to the upper bound can be obtained by

20 Fleiss (1986, pp. 106-107) made the following recommendations regarding the use of the 
Bonferroni method. It should be preferred to Scheffé if the number of comparisons is less than
$a^2$; it should be preferred to Tukey if fewer than all $a(a - 1)/2$ comparisons are needed or if a
relatively small number of other comparisons are to be made; and it should be preferred over 
the Dunnett if the comparisons of interest are other than or in addition to those between each of 
several treatments and a control. 




a multiplicative inequality given by Sidak (1967). It can be shown that²¹

$$\alpha \leq 1 - (1 - \alpha^*)^k \leq k\alpha^*.$$

Thus, instead of testing each contrast at the $\alpha/k$ level of significance, as in
Bonferroni’s or Dunn’s method, each contrast can be tested at the $1 - (1 - \alpha)^{1/k}$
level of significance.²² In general, the use of the Dunn-Sidak’s method requires
a slightly smaller critical value and hence leads to a more powerful test and
a narrower confidence interval than the Bonferroni’s method. For example,
let $\alpha = 0.05$ and suppose there are five contrasts to be tested. Now, $\alpha/k =
0.05/5 = 0.01$ and $1 - (1 - \alpha)^{1/k} = 1 - (1 - 0.05)^{1/5} = 0.0102$. Thus,
the difference between the two significance levels is negligible. Games (1977)
developed tables of critical values of the ¢ statistic for use with the Dunn-Sidak’s 
method. The values are reprinted in Appendix Table XIV. Dunn (1961) and 
Games (1977) made comparisons of Bonferroni and Dunn-Sidak procedures 
with Tukey and Scheffé procedures and found that when there are many means 
in an experiment and the number of comparisons of interest is relatively small 
compared to the number of means in the experiment, the Bonferroni and Dunn- 
Sidak procedures yield shorter confidence intervals than either the Tukey or 
Scheffé procedure. 
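
The per-comparison levels of the two methods can be compared numerically. The following is a minimal sketch; the function names `bonferroni_level` and `sidak_level` are assumptions introduced for illustration, and the check of the overall rate uses the independence assumption of footnote 22.

```python
def bonferroni_level(alpha, k):
    """Per-comparison level alpha / k used by the Bonferroni (Dunn) method."""
    return alpha / k


def sidak_level(alpha, k):
    """Per-comparison level 1 - (1 - alpha)**(1/k) used by the Dunn-Sidak method."""
    return 1.0 - (1.0 - alpha) ** (1.0 / k)


def overall_rate(level, k):
    """Overall error rate for k independent tests, each carried out at `level`."""
    return 1.0 - (1.0 - level) ** k


alpha = 0.05
for k in (2, 5, 10, 20):
    b, s = bonferroni_level(alpha, k), sidak_level(alpha, k)
    print(k, round(b, 5), round(s, 5), round(overall_rate(b, k), 5))
```

Under independence the Dunn-Sidak level recovers the overall rate $\alpha$ exactly, whereas the Bonferroni level gives an overall rate slightly below $\alpha$; the difference between the two per-comparison levels remains small for all $k$ shown.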


Newman-Keuls’s test 

The test, also known as the Student-Newman-Keuls’s test, was first proposed by
Newman (1939) and subsequently popularized by Keuls (1952). The procedure
follows a predetermined criterion for grouping means into subsets and adjusts
the overall error rate $\alpha$ according to the number of means to be tested. It uses
the Studentized critical range values determined by

$$W_{(a)} = q[a, a(n - 1); 1 - \alpha]\sqrt{\frac{MS_W}{n}},$$

where $q[a, a(n - 1); 1 - \alpha]$ is the $100(1 - \alpha)$th percentage point of the Studentized
range distribution with parameters $a$ and $a(n - 1)$. The procedure consists of
arranging the $a$ means $\bar{y}_{1.}, \bar{y}_{2.}, \ldots, \bar{y}_{a.}$ in ascending order as $\bar{y}_{(1)} \leq \bar{y}_{(2)} \leq \cdots \leq
\bar{y}_{(a)}$. It then divides the group means into mutually exclusive subsets so that
means within a subset are not significantly different and the means from distinct
subsets are different. For this purpose, the quantity $\bar{y}_{(a)} - \bar{y}_{(1)}$, called the range of
the set of means, is calculated. Now the observed range $\bar{y}_{(a)} - \bar{y}_{(1)}$ is compared
with $W_{(a)}$. If it is less than $W_{(a)}$, the procedure stops and we conclude that the
$a$ means are not significantly different. If $\bar{y}_{(a)} - \bar{y}_{(1)}$ is greater than $W_{(a)}$, we


21 This result was earlier proved by Dunn (1958) for certain special cases. 

22 Assuming $k$ independent significance tests, each using a significance level of $\alpha^*$, the probability
of not committing type I error in any of the $k$ tests is $(1 - \alpha^*)^k$. The overall or experimentwise
error rate (i.e., the probability of committing at least one type I error) is then given by
$\alpha = 1 - (1 - \alpha^*)^k$. The solution for $\alpha^*$ yields $\alpha^* = 1 - (1 - \alpha)^{1/k}$.




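divide $\bar{y}_{(1)}, \bar{y}_{(2)}, \ldots, \bar{y}_{(a)}$ into two groups, one containing $\bar{y}_{(a)}, \bar{y}_{(a-1)}, \ldots, \bar{y}_{(2)}$,
and the other containing $\bar{y}_{(a-1)}, \bar{y}_{(a-2)}, \ldots, \bar{y}_{(1)}$. Next, each of the ranges in the two
subgroups, viz., $\bar{y}_{(a)} - \bar{y}_{(2)}$ and $\bar{y}_{(a-1)} - \bar{y}_{(1)}$, is compared with $W_{(a-1)}$ determined
by

$$W_{(a-1)} = q[a-1, a(n-1); 1-\alpha] \sqrt{\frac{\mathrm{MS}_W}{n}}.$$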


If neither range exceeds $W_{(a-1)}$, then the means in each of the two
groups are not significantly different and the procedure stops. If either or both
ranges exceed $W_{(a-1)}$, then the $a-1$ means in the corresponding group(s) are
further divided into two groups of $a-2$ means each and the ranges for these
groups are compared with $W_{(a-2)}$. The procedure is continued until a group of
$i$ means is found whose range does not exceed $W_{(i)}$ or all the means have been
compared.
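
The step-down logic just described can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not a definitive implementation. The function name `newman_keuls_subsets` and the callable `q_crit(p, df)`, which is expected to return the tabulated Studentized range point for p means, are hypothetical names introduced here. The critical values in the usage lines are approximate 5 percent points for 20 error degrees of freedom and should be checked against a table before any real use. The sketch reports the largest runs of adjacent ordered means whose range does not exceed the corresponding critical range.

```python
import math
from typing import Callable, List, Sequence


def newman_keuls_subsets(
    means: Sequence[float],
    ms_w: float,
    n: int,
    q_crit: Callable[[int, int], float],
) -> List[List[int]]:
    """Return groups of original indices whose means are not declared different.

    q_crit(p, df) is a user-supplied (hypothetical) lookup returning the
    Studentized range point q[p, df; 1 - alpha] for p means.
    """
    a = len(means)
    df = a * (n - 1)                       # error degrees of freedom a(n - 1)
    order = sorted(range(a), key=lambda i: means[i])
    ordered = [means[i] for i in order]
    se = math.sqrt(ms_w / n)               # sqrt(MS_W / n)

    homogeneous = []                       # (lo, hi) positions in the ordering
    for p in range(a, 1, -1):              # sets of a, a-1, ..., 2 adjacent means
        w_p = q_crit(p, df) * se           # critical range W_(p)
        for lo in range(a - p + 1):
            hi = lo + p - 1
            # A set lying inside an already homogeneous set is not tested again.
            if any(s <= lo and hi <= e for s, e in homogeneous):
                continue
            if ordered[hi] - ordered[lo] <= w_p:
                homogeneous.append((lo, hi))
    return [[order[i] for i in range(lo, hi + 1)] for lo, hi in homogeneous]


# Illustrative 5 percent points q[p, 20; 0.95] (approximate table values).
Q_TABLE_DF20 = {2: 2.95, 3: 3.58, 4: 3.96, 5: 4.23}

groups = newman_keuls_subsets(
    means=[14.5, 10.0, 15.0, 12.0, 10.5],  # hypothetical group means
    ms_w=2.0,
    n=5,
    q_crit=lambda p, df: Q_TABLE_DF20[p],
)
print(groups)
```

With these made-up inputs the three smallest means form one subset and the two largest another. Passing the critical values as a callable keeps the sketch independent of any particular table or software.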


Duncan’s Multiple Range test 

This test developed by Duncan (1952, 1955) also ranks the group means by 
magnitude and then obtains subsets of means that are not significantly different. 
The method adjusts its overall error rate for each comparison rather than a 
prechosen level based on the total number of group means to be determined. 
The procedure is carried out exactly the same way as the Newman-Keuls’s test 
except that the observed ranges are now compared with Duncan’s critical range 
values determined by 


$$D_{(a)} = R[a, a(n-1); 1-\alpha] \sqrt{\frac{\mathrm{MS}_W}{n}},$$

where $R[a, a(n-1); 1-\alpha]$ denotes the $100(1-\alpha)$th percentage point based
on Duncan’s multiple range distribution with parameters $a$ and $a(n-1)$. Some
selected percentage points of Duncan’s multiple range distribution are given
in Appendix Table XII. The procedure has been found to be somewhat less
conservative than Newman-Keuls’s test.


Dunnett’s test 

The test developed by Dunnett (1955) is especially designed for experiments 
that include a control group and the researcher wishes to compare all the re- 
maining group means with the control. The procedure is a simple modification 
of the usual t test with the change that the differences between means involving 
the control group, $|\bar{y}_{i.} - \bar{y}_{c.}|$, $i = 1, 2, \ldots, a-1$, are compared with the critical
ranges determined by

$$D[a-1, a(n-1); 1-\alpha] \sqrt{\mathrm{MS}_W \left(\frac{1}{n_i} + \frac{1}{n_c}\right)},$$




where $\bar{y}_{c.}$ is the mean of the control group and $D[a-1, a(n-1); 1-\alpha]$
denotes the $100(1-\alpha)$th percentage point of the Dunnett distribution with
parameters $a-1$ and $a(n-1)$. Some selected percentage points of the Dunnett
distribution are given in Appendix Table XI. Since the test does not make
any comparison among the noncontrol groups, it is generally more powerful
than other procedures for comparing a control group with other groups. It is
important to point out that the Dunnett procedure should be used when the only
comparisons of interest are the individual treatments against the control.²³


Remark: The SAS PROBMC function computes probabilities or quantiles from the
one-sided or two-sided distribution of Dunnett’s statistic with finite and infinite
degrees of freedom for the variance estimate. For further information and numerical
examples of the PROBMC function, see SAS Institute (1997, Chapter 28).


MULTIPLE COMPARISONS FOR UNEQUAL SAMPLE SIZES AND VARIANCES 


Most of the multiple comparison procedures presented so far are appropriate for 
designs involving equal sample sizes and assuming equal population variances. 
We now summarize some of the procedures designed to be used with unequal 
sample sizes and variances. 


Unequal sample sizes 

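For pairwise comparisons, Tukey (1953) and Kramer (1956, 1957) proposed a
modification of Tukey’s method where the harmonic mean of $n_i$ and $n_{i'}$ is inserted
for $n$ in equation (2.19.14). The resulting intervals, known as Tukey-Kramer
intervals, are given by

$$\bar{y}_{i.} - \bar{y}_{i'.} - q[a, N-a; 1-\alpha] \sqrt{\frac{1}{2}\left(\frac{1}{n_i} + \frac{1}{n_{i'}}\right) \mathrm{MS}_W} < \mu_i - \mu_{i'}
< \bar{y}_{i.} - \bar{y}_{i'.} + q[a, N-a; 1-\alpha] \sqrt{\frac{1}{2}\left(\frac{1}{n_i} + \frac{1}{n_{i'}}\right) \mathrm{MS}_W}. \qquad (2.19.18)$$
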
Remark: Winer (1962; 1971, p. 216) and Miller (1966, p. 43; 1981, p. 43) proposed
the idea for general type contrasts where the harmonic mean of unequal $n_i$’s ($i =
1, 2, \ldots, a$) is substituted for $n$ in equation (2.19.12). Simulation studies by Dunnett
(1980a) showed that for pairwise comparisons Tukey-Kramer intervals (2.19.18) provide
approximately the nominal probability coverage. Later, Hayter (1984) proved analytically that the
probability coverage is always conservative ($\ge 1 - \alpha$). However, Tukey-Kramer-Miller-Winer
procedures are not robust to unequal variances (see, e.g., Howell and Games
(1973); Keselman et al. (1975); Keselman and Rogan (1978)).
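
As a small numerical illustration of the Tukey-Kramer interval (2.19.18), the following Python sketch computes the limits for one pair of means. The function name `tukey_kramer_interval` and all input values are assumptions made for illustration; in particular, the critical value `q` is a placeholder that the reader would replace by the tabulated Studentized range point for a means and N - a error degrees of freedom.

```python
import math


def tukey_kramer_interval(mean_i, mean_j, n_i, n_j, ms_w, q):
    """Limits of (2.19.18) for mu_i - mu_j, given a tabulated critical value q."""
    half_width = q * math.sqrt(0.5 * (1.0 / n_i + 1.0 / n_j) * ms_w)
    diff = mean_i - mean_j
    return diff - half_width, diff + half_width


# Hypothetical inputs; q = 3.0 is a placeholder, not a tabulated value.
lower, upper = tukey_kramer_interval(12.0, 10.0, n_i=4, n_j=4, ms_w=8.0, q=3.0)
print(round(lower, 4), round(upper, 4))
```

When the two sample sizes are equal to n, the half-width reduces to q times the square root of MS_W/n, which is the equal-sample-size Tukey form.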


23 If the researcher is interested in comparing the combination of groups to the control group, 
Scheffé’s test or a generalization of Dunnett’s test proposed by Shaffer (1977) may be used. 




For pairwise comparisons involving k simultaneous intervals, the Bonferroni 
intervals are given by 


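$$\bar{y}_{i.} - \bar{y}_{i'.} - t[N-a, 1-\alpha/(2k)] \sqrt{\mathrm{MS}_W \left(\frac{1}{n_i} + \frac{1}{n_{i'}}\right)} < \mu_i - \mu_{i'}
< \bar{y}_{i.} - \bar{y}_{i'.} + t[N-a, 1-\alpha/(2k)] \sqrt{\mathrm{MS}_W \left(\frac{1}{n_i} + \frac{1}{n_{i'}}\right)}.$$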


In the one-way classification involving all pairwise comparisons k is a(a— 1)/2. 
However, occasionally k would be less if some mean comparisons a priori are 
not of interest. 
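
The bookkeeping for the Bonferroni intervals can be sketched as follows. This is a hedged illustration only: the helper names `bonferroni_level` and `bonferroni_half_width` are hypothetical, and the t percentage point itself is assumed to be read from a table or other software at the level the first helper returns.

```python
import math


def bonferroni_level(a, alpha, k=None):
    """Return (k, level), where the t point needed is t[N - a, level]."""
    if k is None:
        k = a * (a - 1) // 2           # all pairwise comparisons among a means
    return k, 1.0 - alpha / (2.0 * k)


def bonferroni_half_width(t_point, ms_w, n_i, n_j):
    """Half-width t * sqrt(MS_W (1/n_i + 1/n_j)) for one pairwise interval."""
    return t_point * math.sqrt(ms_w * (1.0 / n_i + 1.0 / n_j))


k, level = bonferroni_level(a=5, alpha=0.05)
print(k, level)
```

If only a few comparisons are of interest a priori, passing a smaller `k` gives a lower percentage point and hence shorter intervals.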

For pairwise comparisons, Hochberg (1974) proposed a modification of
Tukey-Kramer intervals given by

$$\bar{y}_{i.} - \bar{y}_{i'.} - m[a(a-1)/2, N-a; 1-\alpha] \sqrt{\mathrm{MS}_W \left(\frac{1}{n_i} + \frac{1}{n_{i'}}\right)} < \mu_i - \mu_{i'}
< \bar{y}_{i.} - \bar{y}_{i'.} + m[a(a-1)/2, N-a; 1-\alpha] \sqrt{\mathrm{MS}_W \left(\frac{1}{n_i} + \frac{1}{n_{i'}}\right)},$$

where $m[p, \nu; 1-\alpha]$ is the $100(1-\alpha)$th percentage point of the Studentized
maximum modulus distribution for $p = a(a-1)/2$ pairwise means with $\nu =
N - a$ degrees of freedom for the error. (For a definition of the Studentized
maximum modulus distribution, see Appendix J.) Some selected percentage
points of the Studentized maximum modulus distribution are given in Appendix
Table XV. The studies by Tamhane (1979) and Dunnett (1980a) have shown that
the procedure is not robust to unequal variances involving unequal sample sizes.
Similarly, Spjotvoll and Stoline (1973) proposed the intervals

$$\bar{y}_{i.} - \bar{y}_{i'.} - q'[a, N-a; 1-\alpha] \sqrt{\mathrm{MS}_W} \max\left(\frac{1}{\sqrt{n_i}}, \frac{1}{\sqrt{n_{i'}}}\right) < \mu_i - \mu_{i'}
< \bar{y}_{i.} - \bar{y}_{i'.} + q'[a, N-a; 1-\alpha] \sqrt{\mathrm{MS}_W} \max\left(\frac{1}{\sqrt{n_i}}, \frac{1}{\sqrt{n_{i'}}}\right),$$

where $q'[p, \nu; 1-\alpha]$ is the $100(1-\alpha)$th percentage point of the Studentized augmented
range distribution for $p$ means with $\nu$ degrees of freedom for the error.
Tables of $q'[p, \nu; 1-\alpha]$ have been prepared by Stoline (1978). Some selected
percentage points of the Studentized augmented range distribution are given in
Appendix Table XVI. These intervals, however, are conservative in the sense
that the true coverage probability is greater than or equal to $1 - \alpha$.


Remark: Ury (1976) studied some of the foregoing procedures including Scheffé and
Dunn-Sidak intervals and found that the choice of the “best interval” depends upon the




particular combination of sample sizes, significance level, number of groups, and the 
error degrees of freedom. Stoline (1981) made a detailed comparison of all the foregoing 
procedures and recommended the general use of the Tukey-Kramer intervals. 


Unequal population variances 
For pairwise comparisons, Games and Howell (1976) proposed a procedure
where the quantity $(1/n_i + 1/n_{i'})\,\mathrm{MS}_W$ under the radical in equation (2.19.18) is
replaced by $S_i^2/n_i + S_{i'}^2/n_{i'}$ for the difference involving the $(i, i')$-pair of means.
The error degrees of freedom $N - a$
in $q[a, N-a; 1-\alpha]$ is further replaced by

$$\nu_{ii'} = \frac{(S_i^2/n_i + S_{i'}^2/n_{i'})^2}{(S_i^2/n_i)^2/(n_i - 1) + (S_{i'}^2/n_{i'})^2/(n_{i'} - 1)}.$$

Extensive Monte Carlo studies by Tamhane (1979) and Dunnett (1980b) have
shown that the procedure can give nonconservative $\alpha$ values, as high as 0.84,
with unequal variances. For a modification of (2.19.18) based on the Studentized
augmented range distribution, see Hochberg (1976).
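
The approximate degrees of freedom used in the Games and Howell procedure are easy to compute directly. The sketch below is illustrative only: the function name `games_howell_terms` is hypothetical, the sample variances and sizes are made up, and the returned scale factor assumes the half-width takes the form of the critical value times the square root of one half of the sum of the two estimated variances of the means.

```python
import math


def games_howell_terms(s2_i, n_i, s2_j, n_j):
    """Return (approximate df, scale factor) for one pair of groups."""
    u = s2_i / n_i                      # estimated variance of mean i
    v = s2_j / n_j                      # estimated variance of mean i'
    df = (u + v) ** 2 / (u ** 2 / (n_i - 1) + v ** 2 / (n_j - 1))
    scale = math.sqrt((u + v) / 2.0)    # multiplies the Studentized range point
    return df, scale


df_equal, scale_equal = games_howell_terms(4.0, 10, 4.0, 10)
df_unequal, _ = games_howell_terms(1.0, 5, 9.0, 20)
print(round(df_equal, 3), round(df_unequal, 3))
```

With equal variances and equal sample sizes n the degrees of freedom reduce to 2(n - 1). In general they lie between the smaller of the two individual degrees of freedom and their sum.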

Dunnett (1980b) suggested a modification of the foregoing procedure in
which the critical value $q[a, \nu_{ii'}; 1-\alpha]$ is replaced by

$$\frac{q[a, n_i - 1; 1-\alpha](S_i^2/n_i) + q[a, n_{i'} - 1; 1-\alpha](S_{i'}^2/n_{i'})}{(S_i^2/n_i) + (S_{i'}^2/n_{i'})}.$$

It should be noted that the critical value given above corresponds to Cochran’s
(1964) approximate solution to the Behrens-Fisher problem and thus the procedure
is expected to be conservative (Dunnett (1980b)).
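
The weighted critical value above is simply an average of the two single-sample critical values, with weights proportional to the estimated variances of the two means. A minimal Python sketch follows; the function name `weighted_critical_value` is hypothetical and the two q values passed in are placeholders for tabulated Studentized range points.

```python
def weighted_critical_value(q_i, q_j, s2_i, n_i, s2_j, n_j):
    """Average q_i and q_j with weights S_i^2/n_i and S_i'^2/n_i'."""
    u = s2_i / n_i
    v = s2_j / n_j
    return (q_i * u + q_j * v) / (u + v)


# Placeholder critical values; real ones come from a Studentized range table.
balanced = weighted_critical_value(4.0, 3.0, 2.0, 10, 2.0, 10)
tilted = weighted_critical_value(4.0, 3.0, 8.0, 10, 2.0, 10)
print(balanced, tilted)
```

The result always lies between the two critical values and moves toward the one belonging to the group with the larger estimated variance of the mean.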

Dunnett (1980b) proposed another modification that utilizes the same statistic
as in Games and Howell but the critical value $q[a, \nu_{ii'}; 1-\alpha]$ is replaced by the
critical value $m[a(a-1)/2, \nu_{ii'}; 1-\alpha]$ of the Studentized maximum modulus
distribution. Dunnett (1980b) found that the procedure is also conservative with
unequal variances.


Remark: Weerahandi (1995) proposed a modification of the Scheffé’s procedure of 
multiple comparison given by (2.19.16) to the case of unequal variances. The procedure 
is, however, too complicated and mathematically intractable for the practitioner to use 
in routine work. 


2.20 EFFECTS OF DEPARTURES FROM ASSUMPTIONS 
UNDERLYING THE ANALYSIS OF VARIANCE MODEL 


In making inferences from the analysis of variance model (2.1.1), we have made 
the following assumptions: 


(i) $e_{ij}$’s are normally distributed;
(ii) $e_{ij}$’s have the same variance $\sigma_e^2$; and
(iii) $e_{ij}$’s are independently distributed.


It stands to reason that in any real-life applications none of the preceding as- 
sumptions can be expected to be completely satisfied. One rarely draws inde- 
pendent random samples from populations that are exactly normally distributed 
with precisely equal variances. The question naturally arises: What are the ef- 
fects of any departure from the assumptions of the model on the inferences 
made? For a thorough discussion of the topic, the reader is referred to Scheffé 
(1959, pp. 331-369), Miller (1986, Chapter 3), and Snedecor and Cochran 
(1989, Chapter 15). Here, we briefly summarize some of the main findings. 


DEPARTURES FROM NORMALITY 


For Model I, many investigations have been made to study the effect of nonnor- 
mality on both the level of significance and the power of the F test employed in 
the analysis of variance. Both analytic results (see, e.g., Scheffé (1959, pp. 345-351))
and empirical studies by Pearson (1931), Geary (1947), Gayen (1950),
Box and Anderson (1955), Boneau (1960, 1962), Srivastava (1959), Bradley 
(1964), Tiku (1964, 1971), and Donaldson (1968) attest to the fact that the 
failure to satisfy this assumption has little effect on the F test. Thus, if depar- 
ture from normality is not too extreme, the lack of normality does not present 
any serious problem, since the means will follow the normal distribution more 
closely than the variates themselves. Both the level of significance and the 
power of the F test are only slightly affected by any departure from normality. 
However, extreme nonnormality may result in a biased test. In this connection, 
it is important to mention that any departure from the kurtosis of the normal 
distribution (either more or less peaked) is much more serious than the skewness
of the distribution in terms of the effects on inferences. Also, platykurtic (flat) 
and leptokurtic (peaked) distributions have little effect on the significance level 
but can have a marked effect on power, particularly when the sample sizes are 
small. Furthermore, only highly skewed distributions would have any marked 
effect either on the level of significance or the power of the F test. 

The point estimates of the factor level means and their contrasts are unbi- 
ased irrespective of whether populations are normal or not. Hence, the F test 
is generally robust against any departures from normality (in skewness and/or 
kurtosis) if sample sizes are large or even if moderately large. For instance, the 
nominal level of significance might be 0.05 whereas the actual level for a non- 
normal population might vary from .044 to .052 depending on the sample size 
and the magnitude of the kurtosis (Box and Anderson (1955)). Generally, the
actual level of significance in the presence of positive kurtosis (leptokurtic) is
slightly higher than the specified one and the real power of the test for positive
kurtosis is slightly higher than the normal one. If the underlying population
has negative kurtosis (platykurtic), the actual power of the test will be slightly
lower than the normal one (Glass et al. (1972)). Single interval estimates of the




factor level means and contrasts and some of the multiple comparison methods 
are also not much affected by the lack of normality provided the sample sizes 
are not too small. The robustness of multiple comparison tests in general has 
not been as thoroughly studied. Among the few studies in this area is that of 
Brown (1974). A number of studies, however, have investigated the robustness 
of several multiple comparison procedures, including Tukey and Scheffé, for 
exponential and chi-square distributions and found little effect on both signif- 
icance level and power (see, e.g., Petrinovich and Hardyck (1969); Keselman 
and Rogan (1978)). Dunnett (1982) reported that the Tukey procedure is conservative
with respect to both significance level and power for long-tailed distributions and
in the presence of outliers. Similarly, Ringland (1983) found that the Scheffé procedure
was conservative for distributions prone to outliers.

For Model II, the lack of normality has more serious implications than for
Model I. The estimates of the variance components are still unbiased, but their
variances depend on the kurtosis of the distribution and the actual confidence
coefficients for interval estimates of $\sigma_e^2$, $\sigma_\alpha^2$, and $\sigma_\alpha^2/\sigma_e^2$ may be substantially
different from the specified one (Singhal and Sahai (1992)). Furthermore, when
testing the null hypothesis that the variance of a random effect is some specified
value different from zero, the test is not robust to the assumption of normality.
For some illustrations and numerical results, the reader is referred to Arvesen
and Schmitz (1970) and Arvesen and Layard (1975). However, if one is concerned
only with a test of the hypothesis $\sigma_\alpha^2 = 0$, then slight departures from
normality have only minor consequences for the conclusions reached when the
sample size is reasonably large (see, e.g., Tan and Wong (1980); Singhal and
Singh (1984); Singhal et al. (1988)).


DEPARTURES FROM EQUAL VARIANCES 


Both the analytical derivations by Box (1954a) and the empirical studies cited 
earlier indicate that if the variances are unequal, the F test for the equality of 
means under Model I is only slightly affected with respect to moderate vio- 
lations of this assumption provided the sample sizes do not differ greatly and 
the parent populations are approximately normally distributed²⁴ (Glass et al.
(1972)). Generally, unequal error variances raise the actual level of
significance slightly above the specified level and result in a slight elevation of
the power function to a degree related to the magnitude of differences among


24 When the variances are unequal, an approximate test similar to the approximate t test when two
group variances are unequal may be used (Welch (1956)). For a description of the test and some 
illustrative examples, see Zar (1996, pp. 189-190). The method has been shown to perform rather 
well when population variances are unequal (Kohr and Games (1974); Levy (1978a); Dijkstra 
and Werter (1981)). For some other approaches to analysis of variance involving heterogeneous 
variances, see James (1951), Brown and Forsythe (1974a,b), Bishop and Dudewicz (1978), 
Clinch and Keselman (1982), Krutchkoff (1988), Wilcox (1988, 1993), and Alexander and 
Govern (1994). For a survey and comparisons of traditional ANOVA alternatives with other 
alternative procedures, see Coombs et al. (1996). 




variances (Box (1954a)). If larger variances are associated with larger sample 
sizes, the level of significance will be slightly less than the nominal value, and 
if they are associated with smaller sample sizes, it will be slightly greater than 
the nominal value (Horsnell (1953); Kohr and Games (1974)). Similarly, if the 
sample sizes do not differ greatly, the Scheffé’s method for multiple compari- 
son is only slightly affected due to any lack of homogeneity of error variances. 
Thus, the F test and related procedures are fairly robust against any departure
from equal error variances provided the sample sizes are nearly equal. Comparisons
of factor level means based on a single contrast, however, are significantly
affected by unequal variances even when sample sizes are equal.

On the other hand, when different numbers of cases appear in various sam- 
ples, even relatively small departures from the assumption of homogeneous 
variances can have very serious consequences for the validity of the final in- 
ference (see, e.g., Scheffé (1959, p. 351); Welch (1956); James (1951); Box 
(1954a); Brown and Forsythe (1974a); Bishop and Dudewicz (1978); Tan and 
Tabatabai (1986)). According to Box (1954a), for samples of unequal sizes, 
even a small violation of this assumption can have a marked effect on the level 
of significance. The actual significance level will exceed the nominal level when
smaller samples are drawn from more heterogeneous populations and will be less
than the nominal value when the smaller samples are drawn from more homogeneous
populations. Furthermore, Rogan and Keselman (1977) found that
the actual significance level may be appreciably larger when the variances are
quite heterogeneous. Moreover, the effect of unequal variances is not appreciably
reduced simply by increasing the sample sizes, as long as the ratios of the
sample sizes remain unchanged.

Krutchkoff (1988) made an extensive simulation study in order to determine 
the size and power of several analysis of variance procedures, including the F 
test, Kruskal-Wallis test and a new procedure called the K test. It was found 
that both the F test and the Kruskal-Wallis test are highly sensitive whereas the 
K test is relatively insensitive to the heterogeneity of variances. The Kruskal- 
Wallis test, however, is not as sensitive to the unequal error variances as the F 
test; and was found to be more robust to nonnormality (when the error vari- 
ances are equal) than either the F test or the K test. A more recent study by 
Lix et al. (1996) seems to indicate that violations of the variance homogeneity 
assumptions can have serious consequences for control of the type I error rate 
regardless of whether group sizes are equal or unequal, but particularly in the 
latter case. The study also found that all the parametric alternatives of the anal- 
ysis of variance test had superior performance when the variance homogeneity 
assumption was violated. Furthermore, when the group sizes were equal, the 
effect of nonnormality on the type I error rate of the F test was no different when 
variances were equal than when they were unequal. The error rates remained 
close to the nominal level regardless of the degree of nonnormality when vari- 
ances were equal and were always inflated across the nonnormal distributions 
when variances were unequal. The pattern was also evident when group sizes 




were unequal. Thus, whenever possible, the experimenter should try to achieve 
a nearly equal number of cases in each factor level unless the assumption of 
equal population variances can reasonably be assured in the experimental con- 
text. It should be observed that the use of equal sample sizes for all factor levels 
not only tends to minimize the effects of unequal variances using the F test, 
but also simplifies the computational procedure. 

For Model II, the effect on the robustness of the F test is the same as for the
fixed effects model. For balanced designs the effects are minimal, but they can be
serious for unbalanced designs. However, the lack of homoscedasticity,
or unequal error variances, can have serious effects on inferences about the
estimation of variance components even when all factor levels contain equal
sample sizes.


DEPARTURES FROM INDEPENDENCE OF ERROR TERMS 


Lack of independence can result from biased measurements or possibly from 
a poor allocation of treatments to experimental units. Departure from indepen- 
dence could also arise in an experiment in which experimental units or plots are 
laid out in a field so that adjacent plots give similar yields. Lack of independence 
can likewise result from correlation in time rather than in space. Thus, the most 
frequent violation of independence assumption occurs when the observations 
are recorded over some time-space coordinate in which adjacent observations 
tend to be correlated. Nonindependence of the error terms can have important 
effects on inferences for both Models I and II. If this assumption is not met, both 
the level of significance and the power of the F test may be strongly affected 
and very serious errors in inferences can be made (Scheffé (1959, p. 945)). The 
direction of the effect depends on the nature of the dependence of the error 
terms. In most cases encountered in practice, the dependence tends to make 
the value of the ratio too large and consequently the significance level will be 
smaller than it should be (although the opposite can also be the case). Thus,
positive correlations among the error terms within a factor level may cause
too many significant results based on the F test and the effect on the t test may
be even greater.

Since the violation of this assumption is often difficult to remedy, every 
possible effort should be made to obtain independent random samples. The use
of random sampling or randomization in various stages of the study can be a
most important protection against dependence of error terms. In general, great
care should be taken to see that the data are based on independent observations, 
both between and within groups; that is, each observation is in no way related 
to any of the other observations. Although dependency among the error terms 
creates a special problem in any analysis of variance, it is not required that 
the observations themselves be completely independent for Model II to apply. 
However, just as Model I is not robust to the assumption of independence, 
Model II is also not robust to this assumption. Violation of this assumption 




generally results in declaring too many significant results in the F test. The 
effects on various point and interval estimates of $\sigma_\alpha^2$, however, are unknown.


2.21 TESTS FOR DEPARTURES FROM ASSUMPTIONS 
OF THE MODEL 


As we have seen in the preceding section, the analysis of variance procedure 
is robust and can tolerate certain departures from the specified assumptions. It 
is, nevertheless, recommended that whenever a departure is suspected it should 
be investigated. In this section, we briefly discuss the tests for normality and 
homoscedasticity. 


Remark: Before carrying out the formal statistical procedures for testing normality and 
homogeneity described here, it may be fairly informative and useful to explore the data 
graphically. For example, one can use box-plots for different groups and/or within group 
histograms to see if the distribution of values in each group is symmetric and free of 
any gross outliers and other anomalies in the data; and if the spread of the data across 
groups is fairly constant. Ideally, if the analysis of variance assumptions are satisfied, 
box-plots should be symmetric and the spreads across the groups should be nearly the 
same. However, in many practical problems involving small sample sizes, the skewness 
and homogeneity may be difficult to evaluate in this way. Box-plots are discussed in most 
introductory statistics textbooks, or one may refer to a book on exploratory data analysis 
(see, e.g., Tukey (1977); Chambers et al. (1983); Hoaglin et al. (1983); Cleveland 
(1985)). Box-plots and related graphical techniques are also available in most statistical 
packages currently being used for data analysis. For a discussion of analysis of variance 
from the viewpoint of exploratory data analysis, see Hoaglin et al. (1991). 


TESTS FOR NORMALITY 


A relatively simple technique to determine the appropriateness of the assump- 
tion of normality is to graph the data points on a normal probability paper. 
If a straight line can be drawn through the plotted points, the assumption of 
normality is considered to be reasonable. 

We now consider some formal tests for normality. They are the chi-square
goodness-of-fit test, and the tests for skewness and kurtosis which are often
used as supplements to the chi-square test.


Chi-square goodness-of-fit test 

In this test the data are grouped into classes to form a frequency distribution and 
the sample mean and standard deviation are calculated. From these quantities a 
normal distribution is fitted and expected frequencies in each class are obtained. 
Let $o_i$ and $e_i$ represent the observed and expected frequencies for the $i$-th class.


Then the test criterion is based on the quantity

$$X^2 = \sum_{i} (o_i - e_i)^2 / e_i, \qquad (2.21.1)$$


where the summation is taken over all the classes. If the data actually come
from a normal distribution, then the quantity (2.21.1) follows approximately a
chi-square distribution with $k - 3$ degrees of freedom, where $k$ is the number
of classes used in the calculation of $X^2$ (one degree of freedom is lost for the
fixed total frequency and two for the estimated mean and standard deviation). If the data come from some other
distribution, the observed $o_i$ will tend to agree poorly with the values of $e_i$ that
are expected on the assumption of normality and the calculated value of $X^2$
will become large. Consequently, large values of $X^2$ lead to the rejection of
the hypothesis of normality. Thus, if the calculated value of the statistic $X^2$
exceeds $\chi^2[k-3, 1-\alpha]$, the $100(1-\alpha)$th percentage point of the chi-square
distribution with $k - 3$ degrees of freedom, we reject the null hypothesis that
the sample is selected from a normal population.

For the validity of the chi-square test, it is required that the expected frequencies
$e_i$ should not be too small. Small expected values are likely to occur only
in the extreme classes. A working rule is that two extreme expectations may be
each as low as 1, provided that most of the other expected values exceed 5. If the
expected values are lower than 1, classes are combined to give an expectation
of at least 1. For a more detailed discussion of these questions refer to Cochran
(1954), Larntz (1978), and Koehler and Larntz (1980).
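
The calculation of the chi-square goodness-of-fit statistic can be sketched in Python using only the standard library. The sketch is illustrative and rests on several assumptions: the function name `chi_square_normality` and the class boundaries are hypothetical, the standard deviation is computed with divisor n - 1, and the data are artificial (evenly spaced normal quantiles), so the statistic is expected to be small. The resulting value would be compared with a tabulated chi-square percentage point with k - 3 degrees of freedom.

```python
import math
from statistics import NormalDist, mean, stdev


def chi_square_normality(data, edges):
    """Return (X2, df, observed, expected) for classes cut at the given edges.

    Classes are (-inf, e1], (e1, e2], ..., (em, +inf); the edges are
    hypothetical class boundaries chosen by the analyst.
    """
    n = len(data)
    fitted = NormalDist(mean(data), stdev(data))    # fitted normal curve
    bounds = [-math.inf] + sorted(edges) + [math.inf]

    def cdf(x):
        if x == -math.inf:
            return 0.0
        if x == math.inf:
            return 1.0
        return fitted.cdf(x)

    observed, expected = [], []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        observed.append(sum(1 for y in data if lo < y <= hi))
        expected.append(n * (cdf(hi) - cdf(lo)))

    x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return x2, len(observed) - 3, observed, expected


# Artificial, nearly normal data: 200 evenly spaced normal quantiles.
data = [NormalDist(50, 10).inv_cdf((i + 0.5) / 200) for i in range(200)]
x2, df, observed, expected = chi_square_normality(data, [35, 42.5, 50, 57.5, 65])
print(round(x2, 3), df)
```

Here six classes give 3 degrees of freedom, for which the 5 percent point of the chi-square distribution is 7.815. In practice the expected frequencies should also be inspected against the working rule stated above before the statistic is interpreted.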


Test for skewness 

One indication of nonnormality occurs when the relative frequency histogram
for the sample data is highly skewed to either the left or right. A measure of the
amount of skewness is given by $\mu_3$, the third moment about the population mean,
which is the average value of $(x - \mu)^3$ taken over the population. The skewness is
positive or negative according to the sign of $\mu_3$. If low values are clustered close
to the mean $\mu$ but high values extend far above the mean, $\mu_3$ will be positive since
the large positive contributions of $(x - \mu)^3$ when $x$ exceeds $\mu$ will predominate
over the smaller negative contributions of $(x - \mu)^3$ obtained when $x$ is less than
$\mu$. Similarly, $\mu_3$ will be negative when the lower tail is the extended one. The
meaning of positive and negative values of $\mu_3$ is illustrated in Figure 2.6.


[FIGURE 2.6. Curves Exhibiting Positive and Negative Skewness and Symmetrical Distribution. Panels: $\gamma_1 > 0$, $\gamma_1 < 0$, $\gamma_1 = 0$.]

The actual measure of skewness is given by the coefficient of skewness
defined as

$$\gamma_1 = \frac{\mu_3}{\mu_2^{3/2}} = \frac{\mu_3}{\sigma^3}. \qquad (2.21.2)$$

The quantity (2.21.2) is independent of the measurement scale and can be
estimated from the sample data by

$$\hat{\gamma}_1 = \frac{m_3}{m_2^{3/2}}, \qquad (2.21.3)$$

where

$$m_3 = \sum_{i=1}^{n} (y_i - \bar{y})^3 / n \quad \text{and} \quad m_2 = \sum_{i=1}^{n} (y_i - \bar{y})^2 / n.$$



A test of the null hypothesis that the sample data are selected from a normal
population can be based on the statistic

$$Z = \frac{\hat{\gamma}_1}{\sqrt{6/n}}, \qquad (2.21.4)$$

where $Z$ is a standard normal variate. The assumption that $Z$ has approximately
a standard normal distribution is accurate enough for this test if $n$ exceeds 150.
For sample sizes between 25 and 200 the one-tailed 5 percent and 10 percent
significance values of $\hat{\gamma}_1$ have been determined from a more accurate approximation
and appear in Pearson and Hartley (1970). Some of these values are
reprinted in Appendix Table XVII(a).
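
The sample coefficient of skewness and the large-sample statistic Z can be computed as in the following Python sketch. It is an illustration under stated assumptions: the function name `sample_skewness` and the two small data sets are made up, and with samples this small the normal approximation for Z would not be trusted; the values only show the sign behavior.

```python
import math


def sample_skewness(y):
    """Return (g1, z): g1 = m3 / m2**1.5 and z = g1 / sqrt(6/n)."""
    n = len(y)
    ybar = sum(y) / n
    m2 = sum((v - ybar) ** 2 for v in y) / n    # second sample moment about the mean
    m3 = sum((v - ybar) ** 3 for v in y) / n    # third sample moment about the mean
    g1 = m3 / m2 ** 1.5
    z = g1 / math.sqrt(6.0 / n)
    return g1, z


g1_sym, z_sym = sample_skewness([1.0, 2.0, 3.0, 4.0, 5.0])      # symmetric sample
g1_right, z_right = sample_skewness([1.0, 1.0, 1.0, 2.0, 10.0])  # long upper tail
print(g1_sym, round(g1_right, 3))
```

A symmetric sample gives a coefficient of zero, while a sample with an extended upper tail gives a positive coefficient.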


Test for kurtosis 

A second kind of departure from normality can be detected by examining the 
kurtosis of the distribution. The kurtosis of a distribution is measured by the 
quantity 


$$\gamma_2 = \frac{\mu_4}{\mu_2^2} = \frac{\mu_4}{\sigma^4}, \qquad (2.21.5)$$

where $\gamma_2$ is called the coefficient of kurtosis. Unlike the coefficient of skewness
($\gamma_1$), $\gamma_2$ measures the heaviness of the tail of a distribution. For the normal




population $\mu_4 = 3\sigma^4$ so that $\gamma_2 = 3$. Heavier-tailed distributions, such as
a t distribution, have $\gamma_2 > 3$ and typically show a larger pile-up near $\mu$ together
with longer tails. Lighter-tailed distributions, such as the uniform distribution,
have $\gamma_2 < 3$ and are typically flatter about $\mu$. This is illustrated
in Figure 2.7.

[FIGURE 2.7. Curves Exhibiting Positive and Negative Kurtosis and the Normal Distribution. Panels: $\gamma_2 > 3$, $\gamma_2 < 3$, $\gamma_2 = 3$.]


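The quantity (2.21.5) can be estimated by

$$\hat{\gamma}_2 = \frac{m_4}{m_2^2}, \qquad (2.21.6)$$

where

$$m_4 = \sum_{i=1}^{n} (y_i - \bar{y})^4 / n \quad \text{and} \quad m_2 = \sum_{i=1}^{n} (y_i - \bar{y})^2 / n.$$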


For large sample sizes ($n > 1{,}000$) a test of the null hypothesis that $\gamma_2 = 3$ can
be based on the statistic

$$Z = \frac{\hat{\gamma}_2 - 3}{\sqrt{24/n}}, \qquad (2.21.7)$$

where again $Z$ has approximately a standard normal distribution. Unfortunately,
one seldom encounters a sample with 1,000 or more observations and the
test statistic (2.21.7) has very little practical utility. For smaller sample sizes,
however, upper and lower percentage points of the distribution of $\hat{\gamma}_2$ have been
tabulated and can be used to test the null hypothesis. Tables
of critical values are given in Pearson and Hartley (1970). Some of these values
are reprinted in Appendix Table XVII(b).
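
The sample coefficient of kurtosis can be computed in the same way as the coefficient of skewness. The Python sketch below is illustrative only: the function name `sample_kurtosis` and the tiny two-point data set are assumptions, and the statistic z is shown merely to make the formula concrete, since its normal approximation requires a very large sample.

```python
import math


def sample_kurtosis(y):
    """Return (g2, z): g2 = m4 / m2**2 and z = (g2 - 3) / sqrt(24/n)."""
    n = len(y)
    ybar = sum(y) / n
    m2 = sum((v - ybar) ** 2 for v in y) / n    # second sample moment about the mean
    m4 = sum((v - ybar) ** 4 for v in y) / n    # fourth sample moment about the mean
    g2 = m4 / m2 ** 2
    z = (g2 - 3.0) / math.sqrt(24.0 / n)
    return g2, z


# A two-point sample has the smallest possible kurtosis coefficient, 1.
g2, z = sample_kurtosis([-1.0, 1.0, -1.0, 1.0])
print(g2, round(z, 4))
```

Values of the coefficient below 3 indicate tails lighter than those of the normal distribution and values above 3 indicate heavier tails.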

Geary (1935, 1936) developed an alternative test criterion for kurtosis based
on the statistic

$$G = \frac{\text{Mean deviation}}{\text{Standard deviation}} = \frac{\sum_{i=1}^{n} |y_i - \bar{y}| / n}{\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2 / n}}. \qquad (2.21.8)$$




The significance values of the statistic (2.21.8) have been tabulated for sample
sizes down to $n = 11$. If $y$ is a normal deviate, the value of $G$ when determined
for the whole population is 0.7979. Positive kurtosis yields lower values and
negative kurtosis higher values of $G$. When applied to the same data, the statistics
$\hat{\gamma}_2$ and $G$ usually agree well in their conclusions. The advantages of $G$ are that
tables are available for smaller sample sizes and that $G$ is relatively easier to
compute.
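
Geary's statistic is simple enough to compute in a few lines. In the Python sketch below, the function name `geary_g` and both data sets are hypothetical. The constant `NORMAL_G` is the square root of 2/π, the ratio of mean deviation to standard deviation for a normal population, which rounds to 0.7979.

```python
import math


def geary_g(y):
    """Ratio of the mean deviation to the standard deviation (divisor n)."""
    n = len(y)
    ybar = sum(y) / n
    mean_dev = sum(abs(v - ybar) for v in y) / n
    std_dev = math.sqrt(sum((v - ybar) ** 2 for v in y) / n)
    return mean_dev / std_dev


NORMAL_G = math.sqrt(2.0 / math.pi)    # population value for a normal deviate

short_tailed = geary_g([-1.0, 1.0, -1.0, 1.0])
long_tailed = geary_g([-10.0, -1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 10.0])
print(round(NORMAL_G, 4), short_tailed, round(long_tailed, 4))
```

In these two artificial samples the short-tailed one gives a value above 0.7979 and the long-tailed one a value below it.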


OTHER TESTS FOR NORMALITY 


The foregoing procedures are some of the classical tests of normality. Over 
the years a large number of other techniques have been developed for testing 
for departures from normality. In the following we describe some powerful 
omnibus tests proposed for the problem. For further information on tests of 
normality, see Royston (1983, 1991, 1993a,b,c). 


Shapiro-Wilk’s W test 
Shapiro and Wilk (1965) proposed a relatively powerful procedure based on
the statistic

$$W = \frac{\left(\sum_{i=1}^{n} a_i y_{(i)}\right)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}, \qquad (2.21.9)$$

where $y_{(1)} \le y_{(2)} \le \cdots \le y_{(n)}$ represent the order statistics and the coefficients
$a_i$’s are the optimal weights for the weighted least squares estimator of the
standard deviation for a normal population. Inasmuch as $a_{n-i+1} = -a_i$, the
expression $\sum_{i=1}^{n} a_i y_{(i)}$ can be written as $\sum_{i=1}^{k} a_{n-i+1} (y_{(n-i+1)} - y_{(i)})$ where
$k = n/2$, if $n$ is even, or $(n-1)/2$, if $n$ is odd. For $n$ odd, the middle observation
is used in the calculation of $\sum_{i=1}^{n} (y_i - \bar{y})^2$, but is not used in the calculation of
$\sum_{i=1}^{k} a_{n-i+1} (y_{(n-i+1)} - y_{(i)})$. Thus, for $n$ odd, $a_{(n+1)/2} = a_{k+1}$ appears as zero in
Appendix Table XVIII. Also note that the W test is two-sided because the test
statistic (2.21.9) is in a quadratic form. The hypothesis of normality is rejected
at the $\alpha$ significance level if $W$ is less than the $\alpha$th quantile of the null distribution
of $W$. The coefficients $a_i$, for $2 \le n \le 50$, were given by the authors and
some selected values are given in Appendix Table XVIII. A short table of critical
values of the statistic (2.21.9) originally given by Shapiro and Wilk (1965) is also
reprinted in Appendix Table XIX. Royston (1982a, b) has provided an approximation
to the null distribution of $W$ and a FORTRAN algorithm for $n \le 2000$.
The Shapiro and Wilk W test is one of the most powerful omnibus tests for testing
normality. Extensive empirical Monte Carlo simulation studies by Shapiro
et al. (1968) and Pearson et al. (1977) have shown that W is more powerful
against a wide range of alternative distributions. The test is found to be good
against short or very long-tailed alternatives even for a sample as small as 10.




Example 1. To illustrate the procedure, consider a sample of 10 observations
given by 2.4, 2.7, 2.6, 3.4, 3.2, 3.5, 3.2, 3.4, 3.6, and 3.5. The
order statistics are determined as

2.4, 2.6, 2.7, 3.2, 3.2, 3.4, 3.4, 3.5, 3.5, 3.6.

Since n = 10, we have k = 5. Using the 5 coefficients $a_1, a_2, a_3, a_4, a_5$
from Appendix Table XVIII, we obtain

$$\begin{aligned}
\sum_{i=1}^{k} a_{n-i+1}\,(y_{(n-i+1)} - y_{(i)})
&= 0.5739(3.6 - 2.4) + 0.3291(3.5 - 2.6) \\
&\quad + 0.2141(3.5 - 2.7) + 0.1224(3.4 - 3.2) + 0.0399(3.4 - 3.2) \\
&= 1.18861.
\end{aligned}$$


Furthermore, for the given set of observations, $\sum_{i=1}^{n} (y_i - \bar{y})^2 = 1.645$.
Hence, $W = (1.18861)^2/1.645 = 0.859$. From Appendix Table XIX, the
5 percent critical value of the W statistic is W(10, 0.05) = 0.842. Since
W > W(10, 0.05), we cannot reject the hypothesis of normality and conclude
that it is reasonable to assume that the data are normally distributed.
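
The arithmetic of Example 1 can be retraced with a short Python sketch. The coefficients are those quoted in the example for n = 10, while the function name `shapiro_wilk_w` and the variable names are hypothetical; this is an illustration of the hand calculation, not a general implementation of the W test.

```python
# Coefficients a_{n-i+1}, i = 1, ..., 5, as quoted in Example 1 for n = 10.
coefficients = [0.5739, 0.3291, 0.2141, 0.1224, 0.0399]
sample = [2.4, 2.7, 2.6, 3.4, 3.2, 3.5, 3.2, 3.4, 3.6, 3.5]

def shapiro_wilk_w(sample, coefficients):
    """W = (sum of a_{n-i+1} (y_(n-i+1) - y_(i)))^2 / sum of (y_i - ybar)^2."""
    ordered = sorted(sample)
    n = len(ordered)
    k = n // 2  # n/2 for even n, (n - 1)/2 for odd n
    numerator = sum(
        a * (ordered[n - 1 - i] - ordered[i])
        for i, a in zip(range(k), coefficients)
    )
    mean = sum(ordered) / n
    sum_sq = sum((y - mean) ** 2 for y in ordered)
    return numerator ** 2 / sum_sq

w_value = shapiro_wilk_w(sample, coefficients)
critical_value = 0.842  # W(10, 0.05) as quoted from Appendix Table XIX
reject_normality = w_value < critical_value  # small W indicates nonnormality
```

Here `w_value` is compared with the lower-tail critical value, so normality is rejected only when W falls below it.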


Shapiro-Francia’s test 
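Shapiro and Francia (1972) proposed a modification of the W statistic defined
by

$$W' = \frac{\left(\sum_{i=1}^{n} b_i y_{(i)}\right)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}, \tag{2.21.10}$$

where the coefficients $b_i$ are determined by

$$b_i = \frac{m_i}{\sqrt{\sum_{j=1}^{n} m_j^2}},$$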


with $m_i$ representing the expected values of the order statistics from a unit
normal distribution. Inasmuch as $b_{n-i+1} = -b_i$, the expression $\sum_{i=1}^{n} b_i y_{(i)}$ can
be written as $\sum_{i=1}^{k} b_{n-i+1}(y_{(n-i+1)} - y_{(i)})$ where k = n/2, if n is even, or
(n − 1)/2, if n is odd. For n odd, the middle observation is used in the calculation of
$\sum_{i=1}^{n} (y_i - \bar{y})^2$, but is not used in the calculation of $\sum_{i=1}^{k} b_{n-i+1}(y_{(n-i+1)} - y_{(i)})$.
Again, note that the W′ test is one-sided because the test statistic (2.21.10) is in
a quadratic form. The hypothesis of normality is rejected at the $\alpha$ significance
level if W′ is less than the $\alpha$th quantile of the null distribution of W′. Extensive
tables of $m_i$ are given in Harter (1961, 1969b). A small table of critical values
of the statistic (2.21.10) is given by Shapiro and Francia (1972).


Example 2. To illustrate the procedure, we use the data on birthweights 
of twelve piglets in a particular litter from an experiment reported by 
Royston et al. (1982). The data have also been referred to and analyzed by 
Royston (1993c). It is widely believed that piglets provide a good model 
for the human neonate, especially in studies involving turnover of glucose. 
The order statistics of birthweight data and the corresponding expected 
values of the order statistics from a unit normal distribution are given in 
Table 2.12. 


TABLE 2.12
Calculations for Shapiro-Francia’s Test

   i     y_(i)      m_i
   1      605    -1.6292
   2      858    -1.1157
   3      862    -0.7929
   4      992    -0.5368
   5     1006    -0.3122
   6     1018    -0.1025
   7     1020     0.1025
   8     1079     0.3122
   9     1088     0.5368
  10     1110     0.7929
  11     1120     1.1157
  12     1166     1.6292

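For n = 12, k = 6, the coefficients $b_{n-i+1}$, i = 1, 2, ..., 6, are determined as

$$b_7 = \frac{0.1025}{\sqrt{9.84778}} = 0.0327, \qquad b_8 = \frac{0.3122}{\sqrt{9.84778}} = 0.0995,$$

$$b_9 = \frac{0.5368}{\sqrt{9.84778}} = 0.1711, \qquad b_{10} = \frac{0.7929}{\sqrt{9.84778}} = 0.2527,$$

$$b_{11} = \frac{1.1157}{\sqrt{9.84778}} = 0.3555, \qquad b_{12} = \frac{1.6292}{\sqrt{9.84778}} = 0.5192.$$
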
Now,

$$\begin{aligned}
\sum_{i=1}^{k} b_{n-i+1}\,(y_{(n-i+1)} - y_{(i)})
&= 0.5192(1166 - 605) + 0.3555(1120 - 858) \\
&\quad + 0.2527(1110 - 862) + 0.1711(1088 - 992) \\
&\quad + 0.0995(1079 - 1006) + 0.0327(1020 - 1018) \\
&= 470.8363;
\end{aligned}$$


and

$$\sum_{i=1}^{n} (y_i - \bar{y})^2 = 263{,}616.6667.$$

Hence, the Shapiro-Francia statistic W′ is given by

$$W' = (470.8363)^2/263{,}616.6667 = 0.841.$$


Since the critical values of the W′ statistic are not readily available, we
employ a normal approximation due to Royston (1993c). It can be shown
that the statistic $\log_e(1 - W')$ is approximately normally distributed with
mean $\hat{\mu} = -1.2725 + 1.052(v - u)$ and standard deviation $\hat{\sigma} = 1.0308 -
0.26758(v + 2/u)$ where $u = \log_e(n)$ and $v = \log_e(u)$. The values of the
normal deviate $Z' = \{\log_e(1 - W') - \hat{\mu}\}/\hat{\sigma}$ are referred to the upper-tail
critical values of the standard normal distribution. Values of Z′ > 1.645
indicate departures from normality at the 5 percent significance level. For
the birthweight data considered previously,

$$\hat{\mu} = -2.929, \qquad \hat{\sigma} = 0.572,$$

$$Z' = \{\log_e(1 - 0.841) - (-2.929)\}/(0.572) = 1.91.$$

Since Z′ > 1.645, the hypothesis of normality of the birthweight data is
rejected (p = 0.028).
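
The steps of Example 2 may be sketched in Python as follows. The data and the expected normal order statistics are those of Table 2.12 (with the smallest birthweight taken as 605, the value used in the calculation above), and the constants of the normal approximation are those quoted in the text. The function names `shapiro_francia` and `royston_z` are hypothetical, and small differences from the hand calculation arise because the coefficients are not rounded to four decimals.

```python
import math

birthweights = [605, 858, 862, 992, 1006, 1018, 1020, 1079, 1088, 1110, 1120, 1166]
expected_order_stats = [-1.6292, -1.1157, -0.7929, -0.5368, -0.3122, -0.1025,
                        0.1025, 0.3122, 0.5368, 0.7929, 1.1157, 1.6292]

def shapiro_francia(sample, m):
    """W' = (sum of b_i y_(i))^2 / sum of (y_i - ybar)^2 with b_i = m_i / sqrt(sum m_j^2)."""
    ordered = sorted(sample)
    n = len(ordered)
    norm = math.sqrt(sum(v * v for v in m))
    b = [v / norm for v in m]
    numerator = sum(bi * yi for bi, yi in zip(b, ordered))
    mean = sum(ordered) / n
    sum_sq = sum((y - mean) ** 2 for y in ordered)
    return numerator ** 2 / sum_sq

def royston_z(w_prime, n):
    """Normal deviate for log_e(1 - W') using the constants quoted in the text."""
    u = math.log(n)
    v = math.log(u)
    mu = -1.2725 + 1.052 * (v - u)
    sigma = 1.0308 - 0.26758 * (v + 2.0 / u)
    return (math.log(1.0 - w_prime) - mu) / sigma

w_prime = shapiro_francia(birthweights, expected_order_stats)
z_prime = royston_z(w_prime, len(birthweights))
# Upper-tail standard normal probability via the complementary error function.
p_value = 0.5 * math.erfc(z_prime / math.sqrt(2.0))
```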


D’Agostino’s D test 
D’Agostino (1971) proposed a test statistic

$$D = \frac{\sum_{i=1}^{n} \left\{ i - \tfrac{1}{2}(n + 1) \right\} y_{(i)}}{n \sqrt{n \sum_{i=1}^{n} (y_i - \bar{y})^2}}, \tag{2.21.11}$$

which is also a modification of the W statistic where the coefficients $a_i$ are
replaced by $w_i = i - \tfrac{1}{2}(n + 1)$ and, thus, no tables of coefficients are needed.


Note that in contrast to the W and W′ tests, the D test is two-sided since the
statistic (2.21.11) is in a linear form. The hypothesis of normality is rejected
at the $\alpha$ significance level if D is less than the $(\alpha/2)$th quantile or greater than
the $(1 - \alpha/2)$th quantile of the null distribution of D. The test, originally proposed
for moderate sample sizes, is also an omnibus test and can detect deviations
from normality due to either skewness or kurtosis. Tables of percentage points of the
approximate standardized D distribution of the statistic (2.21.11) were given by
D’Agostino (1972). The test is computationally much simpler than the Shapiro-Wilk’s
test. The studies by Theune (1973) have shown that the Shapiro-Wilk’s
test is preferable over D’Agostino’s test for sample sizes up to 50 for lognormal,
chi-square, uniform, and U-shaped alternatives.


Example 3. To illustrate the procedure, we again consider the data of
Example 1. For the given data set,

$$\begin{aligned}
\sum_{i=1}^{n} \left\{ i - \tfrac{1}{2}(n + 1) \right\} y_{(i)}
&= \sum_{i=1}^{n} i\, y_{(i)} - \tfrac{1}{2}(n + 1) \sum_{i=1}^{n} y_{(i)} \\
&= 1(2.4) + 2(2.6) + \cdots + 10(3.6) - \tfrac{1}{2}(10 + 1)(2.4 + 2.6 + \cdots + 3.6) \\
&= 184.2 - (5.5)(31.5) \\
&= 10.95
\end{aligned}$$

and, as before, $\sum_{i=1}^{n} (y_i - \bar{y})^2 = 1.645$. Hence, $D = 10.95/\{10\sqrt{10(1.645)}\}
= 0.26998$. From Appendix Table XX, the 5 percent critical value of the
D statistic is D(10, 0.05) = (0.2513, 0.2849). Since the calculated value
of D lies in this interval, the hypothesis of normality is not rejected and
we may conclude that it is reasonable to assume that the data are normally
distributed.
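
The calculation of Example 3 can be reproduced with the following minimal Python sketch. The function name `dagostino_d` is hypothetical, and the critical interval is the one quoted in the example.

```python
import math

def dagostino_d(sample):
    """D = sum of {i - (n + 1)/2} y_(i) divided by n * sqrt(n * sum of (y_i - ybar)^2)."""
    ordered = sorted(sample)
    n = len(ordered)
    mean = sum(ordered) / n
    numerator = sum((i - 0.5 * (n + 1)) * y for i, y in enumerate(ordered, start=1))
    sum_sq = sum((y - mean) ** 2 for y in ordered)
    return numerator / (n * math.sqrt(n * sum_sq))

sample = [2.4, 2.7, 2.6, 3.4, 3.2, 3.5, 3.2, 3.4, 3.6, 3.5]
d_value = dagostino_d(sample)

# Two-sided decision: reject when D falls outside the tabulated interval.
lower, upper = 0.2513, 0.2849  # D(10, 0.05) as quoted from Appendix Table XX
reject_normality = not (lower < d_value < upper)
```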


For discussions of tests especially designed for detecting outliers, see Hawkins 
(1980), Beckman and Cook (1983), and Barnett and Lewis (1994). Robust esti- 
mation procedures have also been employed in detecting extreme observations. 
The procedures give less weight to data values that are extreme in comparison 
to the rest of the data. Robust estimation techniques have been reviewed by 
Huber (1981) and Hampel et al. (1986). 


TESTS FOR HOMOSCEDASTICITY 


If there are just two populations (i.e., a = 2), the equality of two population
variances can be tested by using the usual F test from the fact that the
statistic

$$F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}$$

has the Snedecor’s F distribution with $n_1 - 1$ and $n_2 - 1$ degrees of freedom.
Here, $\sigma_1^2$ and $\sigma_2^2$ are population variances and $S_1^2$ and $S_2^2$ are the corresponding
sample estimators based on independent samples of sizes $n_1$ and $n_2$, respectively.
However, with a > 2, rather than making all pairwise F tests, we want
a single test that can be used to verify the assumption of equality of population
variances. There are several tests available for this purpose. The three
most commonly used tests are Bartlett’s, Hartley’s, and Cochran’s tests.²⁵ The
Bartlett’s test compares the weighted arithmetic and geometric means of the
sample variances. The Hartley’s test compares the ratio of the largest to the
smallest variance. The Cochran’s test compares the largest sample variance to
the average of all the sample variances. We now describe these procedures and
illustrate their applications with examples. They, however, have lower power
than is desired for most applications and are adversely affected by nonnormality.
In the following, we are concerned with testing the hypothesis: 


$$H_0: \sigma_1^2 = \sigma_2^2 = \cdots = \sigma_a^2 \quad \text{versus} \quad H_1: \sigma_i^2 \ne \sigma_j^2 \ \text{for at least one } (i, j) \text{ pair}. \tag{2.21.12}$$


Bartlett’s test 
The basic idea under the Bartlett’s (1937a, b) test is as follows. Given the
observations $y_{ij}$’s from model (2.1.1), let

$$T_A = \frac{1}{\sum_{i=1}^{a} (n_i - 1)} \sum_{i=1}^{a} (n_i - 1) S_i^2$$

and

$$T_G = \left\{ \prod_{i=1}^{a} (S_i^2)^{(n_i - 1)} \right\}^{1/\sum_{i=1}^{a} (n_i - 1)},$$

where

$$S_i^2 = \frac{\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{i.})^2}{n_i - 1}.$$

25 For a discussion of an exact test based on the generalized likelihood ratio principle, which
is asymptotically equivalent to the Bartlett’s test, see Weerahandi (1995). For a discussion
of a general class of tests for homogeneity of variances and their properties, see Cohen and
Strawderman (1971).

It should be noted that $T_A$ and $T_G$ are weighted arithmetic and geometric averages
of the $S_i^2$’s which are the usual sample variances of the observations at
different factor levels. It is well known that

$$T_G \le T_A$$

and the two averages are equal if all $S_i^2$’s are equal. Thus, the greater the variation
among the $S_i^2$’s the farther apart the two averages will be. Hence, if the ratio

$$R = T_A/T_G$$

is close to 1, we have the evidence that the population variances are equal. If R
is large, it would indicate that the population variances are unequal. The same
conclusion would follow if we use $\log_e(R) = \log_e(T_A) - \log_e(T_G)$ instead of
R. Thus, Bartlett’s test is based on the statistic R or $\log_e(R)$; rejecting the null
hypothesis if the statistic is significantly greater than unity.²⁶

Inasmuch as the sampling distribution of R or $\log_e(R)$ is not readily available,
Bartlett considered two approximations of R. First, for large sample sizes, a
function of $\log_e(R)$ has approximately a chi-square distribution with a − 1
degrees of freedom under the hypothesis that the population variances are equal.
More specifically, if each $n_i \ge 5$, the statistic

$$B = \frac{K}{1 + L}, \tag{2.21.13}$$

where

$$K = \sum_{i=1}^{a} (n_i - 1) \log_e(T_A) - \sum_{i=1}^{a} (n_i - 1) \log_e(S_i^2) \tag{2.21.14}$$

and

$$L = \frac{1}{3(a - 1)} \left[ \sum_{i=1}^{a} \frac{1}{n_i - 1} - \frac{1}{\sum_{i=1}^{a} (n_i - 1)} \right], \tag{2.21.15}$$

has approximately a chi-square distribution with a − 1 degrees of freedom (see
also Nagasenkar (1984)). Thus, if the calculated value of the statistic B exceeds
$\chi^2[a - 1, 1 - \alpha]$, the $100(1 - \alpha)$th percentage point of the chi-square distribution
with a − 1 degrees of freedom, we reject the null hypothesis that the population
variances are equal. The accuracy of this approximation has been considered
by Bishop and Nair (1939), Hartley (1940), and Barnett (1962).

26 Equivalently, Bartlett’s test can be based on the statistic $R^{-1} = T_G/T_A$, rejecting the null
hypothesis if the statistic is significantly smaller than unity.

The chi-square approximation to the distribution of the Bartlett’s test statistic
(2.21.13) is not appropriate when any of the $n_i$’s are less than five. An approximation
which is more accurate when some of the $n_i$’s are small is based on the
F distribution. The approximation consists of considering the statistic

$$B' = \frac{\nu_2 K}{\nu_1 (M - K)}, \tag{2.21.16}$$

where

$$\nu_1 = a - 1, \tag{2.21.17}$$

$$\nu_2 = (a + 1)/L^2, \tag{2.21.18}$$

and

$$M = \nu_2/\{1 - L + 2/\nu_2\}, \tag{2.21.19}$$

which has a sampling distribution approximated by an F distribution with $\nu_1$
and $\nu_2$ degrees of freedom. The value of $\nu_2$ will usually not be an integer
and it may be necessary to interpolate in the F table. Good accuracy can be
achieved by the method of two-way harmonic interpolation (see, e.g., Laubscher
(1965)) based on the reciprocals of the degrees of freedom. Usually, however,
the observed value of B′ will differ substantially from the tabulated value and
in that case an interpolation may not be required. When a = 2, and for equal
sample sizes, Bartlett’s test reduces to the two-sided variance ratio F test. When
the two sample sizes are unequal, the two methods, however, may give different
results (Maurais and Quimet (1986)).


Remark: Exact critical values obtained from the null distributions of Bartlett’s statistic 
for the case involving equal sample sizes have been given by Harsaae (1969), Glaser 
(1976), and Dyer and Keating (1980). For very small values of $n_i$’s tables are given in
Hartley (1940) and Pearson and Hartley (1970). For equal sample sizes, some selected 
percentage points of the distribution are given in Appendix Table XXI. Algebraic ex- 
pressions for determining exact critical values for Bartlett’s test for unequal sample sizes 
have been derived by Chao and Glaser (1978) and Dyer and Keating (1980). 




Example 4. To illustrate the procedure, consider the data given in 
Table 2.5. The calculations needed for Bartlett’s test are summarized in 
Table 2.13. On substituting the appropriate quantities into (2.21.14) and 
(2.21.15), and then into (2.21.13), we obtain 


K = 2.1005, 
L = 0.1736, 


B = 2.1005/(1 + 0.1736) = 1.7898. 


Since B = 1.7898 is less than $\chi^2[4, 0.95] = 9.49$ from Appendix Table IV
(with a p-value of 0.774), we do not reject the null hypothesis that the five
variances are all equal.


TABLE 2.13
Calculations for Bartlett’s Test

Treatment   n_i - 1    S_i^2     log_e S_i^2   (n_i - 1)S_i^2   (n_i - 1) log_e S_i^2
    1          2      16.0000      2.7726         32.0000             5.5452
    2          3       6.0000      1.7918         18.0000             5.3754
    3          2       3.0000      1.0986          6.0000             2.1972
    4          3       4.0000      1.3863         12.0000             4.1589
    5          2       4.0000      1.3863          8.0000             2.7726

To use the F approximation given by (2.21.16), on substituting the 
appropriate quantities into (2.21.17), (2.21.18), and (2.21.19), and then 
into (2.21.16), we have 


$$\nu_1 = 4,$$

$$\nu_2 = 6/(0.1736)^2 = 199.1,$$

$$M = 199.1/(1 - 0.1736 + 2/199.1) = 238.0311,$$

$$B' = \frac{(199.1)(2.1005)}{4(238.0311 - 2.1005)} = 0.44.$$



Furthermore, for $\alpha = 0.05$, we obtain from Appendix Table V that
$F[4, 120; 0.95] = 2.45$ and $F[4, \infty; 0.95] = 2.37$. Using the harmonic
interpolation, based on the reciprocals of the degrees of freedom, we
have

$$F[4, 199.1; 0.95] \approx 2.45 + \frac{\dfrac{1}{199.1} - \dfrac{1}{120}}{\dfrac{1}{\infty} - \dfrac{1}{120}}\,(2.37 - 2.45) = 2.41.$$

Since B′ = 0.44 < 2.41 with a p-value of P{F[4, 199.1] > 0.44} = 0.779,
we do not reject the null hypothesis that the five variances are all equal.
Thus, the conclusion from the Bartlett’s test using the F approximation is
the same as using the chi-square approximation.
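
Both approximations used in Example 4 can be traced in a short Python sketch built on formulas (2.21.13) through (2.21.19). The inputs are the degrees of freedom and sample variances of Table 2.13; the function names are hypothetical, and the closed-form chi-square tail probability used here is an assumption of convenience that holds only for even degrees of freedom. Because the code does not round intermediate quantities, its results agree with the hand calculation only to about three decimals.

```python
import math

def bartlett_approximations(df, variances):
    """Return K, L, B (chi-square form) and nu1, nu2, M, B' (F form)."""
    a = len(df)
    total_df = sum(df)
    # Weighted arithmetic mean of the sample variances.
    t_a = sum(d * s for d, s in zip(df, variances)) / total_df
    k = total_df * math.log(t_a) - sum(d * math.log(s) for d, s in zip(df, variances))
    ell = (sum(1.0 / d for d in df) - 1.0 / total_df) / (3.0 * (a - 1))
    b = k / (1.0 + ell)
    nu1 = a - 1
    nu2 = (a + 1) / ell ** 2
    m = nu2 / (1.0 - ell + 2.0 / nu2)
    b_prime = nu2 * k / (nu1 * (m - k))
    return {"K": k, "L": ell, "B": b, "nu1": nu1, "nu2": nu2, "M": m, "B_prime": b_prime}

def chi_square_upper_tail_even_df(x, df):
    """P(chi-square with df degrees of freedom > x); valid only when df is even."""
    half = x / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(df // 2))

df = [2, 3, 2, 3, 2]                      # n_i - 1 from Table 2.13
variances = [16.0, 6.0, 3.0, 4.0, 4.0]    # S_i^2 from Table 2.13

result = bartlett_approximations(df, variances)
p_chi_square = chi_square_upper_tail_even_df(result["B"], result["nu1"])
```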


Example 5. In this example, we simultaneously illustrate the Shapiro- 
Wilk’s W test for normality followed by the Bartlett’s test for homogeneity 
of variances. We further describe the use of exact critical values for the 
Bartlett’s test statistic given in Appendix Table XXI. Dyer and Keating 
(1980) reported and analyzed data on the sealed bids on each of five Texas 
offshore oil and gas leases selected from 110 leases issued on May 21, 
1968. Using the probability plots it was shown that the bids on each of 
the five leases are lognormally distributed. The logarithmic scores of the 
sealed bids on each of five leases are given in Table 2.14. 

TABLE 2.14
Data on Log-bids of Five Texas Offshore Oil and Gas Leases

                   Lease No.
    I         II        III       IV        V
  16.269    16.292    13.980    17.589    17.188
  15.733    15.223    13.273    16.557    16.712
  15.256    15.100    12.616    16.264    16.259
  15.035    14.995    12.134    15.957    16.128
  14.847    14.430    11.629    15.910    15.463
  14.223    14.307              15.539    15.100
  13.987    13.520              15.370    14.565
  13.521    13.463              14.847    14.519
            13.129              14.785    13.521
            12.597              13.521    13.014
                                13.503    13.003
                                13.003    12.622
                                          12.530

  n_1 = 8, S_1^2 = 0.842;  n_2 = 10, S_2^2 = 1.282;  n_3 = 5, S_3^2 = 0.859;
  n_4 = 12, S_4^2 = 1.883;  n_5 = 13, S_5^2 = 2.635.

Source: Dyer and Keating (1980). Used with permission.

Proceeding as in Example 1, the Shapiro-Wilk’s W statistic for each of
the five groups of leases is determined as:

$$W_1(8) = \frac{\{0.6052(16.269 - 13.521) + \cdots + 0.0561(15.035 - 14.847)\}^2}{7(0.842)} = 0.982,$$

$$W_2(10) = \frac{\{0.5739(16.292 - 12.597) + \cdots + 0.0399(14.430 - 14.307)\}^2}{9(1.282)} = 0.970,$$

$$W_3(5) = \frac{\{0.6646(13.980 - 11.629) + \cdots + 0.2413(13.273 - 12.134)\}^2}{4(0.859)} = 0.982,$$

$$W_4(12) = \frac{\{0.5475(17.589 - 13.003) + \cdots + 0.0922(15.539 - 15.370)\}^2}{11(1.883)} = 0.960,$$

and

$$W_5(13) = \frac{\{0.5359(17.188 - 12.530) + \cdots + 0.0539(15.100 - 14.519)\}^2}{12(2.635)} = 0.928.$$


From Appendix Table XIX, the 5 percent critical values for the W statistic 
in each of the five groups are: 


$$W_1(8, 0.05) = 0.818, \quad W_2(10, 0.05) = 0.842, \quad W_3(5, 0.05) = 0.762,$$

$$W_4(12, 0.05) = 0.859, \quad W_5(13, 0.05) = 0.866.$$

Since in each group, $W_i(n_i) > W_i(n_i, 0.05)$, the lognormality of the bids
data is not rejected at the 5 percent significance level. In fact, it can be
verified that the hypothesis of lognormality is sustained at a significance
level of 0.5 or lower.




We now test for homogeneity of variances using Bartlett’s test. For the
log-bids data, the weighted arithmetic and geometric means of the $S_i^2$’s, $T_A$ and
$T_G$, are determined as

$$T_A = \frac{\sum_{i=1}^{5} (n_i - 1) S_i^2}{\sum_{i=1}^{5} (n_i - 1)} = \frac{(7 \times 0.842) + \cdots + (12 \times 2.635)}{7 + \cdots + 12} = 1.7023$$

and

$$T_G = \left[ \prod_{i=1}^{5} (S_i^2)^{(n_i - 1)} \right]^{1/\sum_{i=1}^{5} (n_i - 1)} = (0.842)^{7/43} \cdots (2.635)^{12/43} = 1.5560.$$

Hence, the Bartlett’s test statistic B is given by

$$B = T_G/T_A = 1.5560/1.7023 = 0.9141.$$


From Appendix Table XXI, the 5 percent critical value is approximately
determined as

$$\begin{aligned}
B(8, 10, 5, 12, 13; 0.05)
&\approx \left(\tfrac{8}{48}\right) 0.7512 + \left(\tfrac{10}{48}\right) 0.8025 + \left(\tfrac{5}{48}\right) 0.5982 \\
&\quad + \left(\tfrac{12}{48}\right) 0.8364 + \left(\tfrac{13}{48}\right) 0.8498 \\
&= 0.7935.
\end{aligned}$$

Since B > 0.7935, the hypothesis of homogeneity is not rejected at the
5 percent significance level. As a matter of fact, it can be verified that
the approximate 25 percent critical value is 0.8757 and the hypothesis of
homogeneity is sustained at a significance level of 0.25 or lower.
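
The Bartlett calculations of Example 5 may be sketched in Python as follows. The sample sizes, variances, and equal-sample-size critical values are those quoted in the example; the weighting of the critical values by $n_i/N$ with N = 48 is an assumption read off the arithmetic of the example, and all variable names are hypothetical.

```python
import math

sizes = [8, 10, 5, 12, 13]
variances = [0.842, 1.282, 0.859, 1.883, 2.635]
# 5 percent points for equal sample sizes n_i, as quoted in Example 5.
equal_n_critical = [0.7512, 0.8025, 0.5982, 0.8364, 0.8498]

df = [n - 1 for n in sizes]
total_df = sum(df)

# Weighted arithmetic and geometric means of the sample variances.
t_a = sum(d * s for d, s in zip(df, variances)) / total_df
t_g = math.exp(sum(d * math.log(s) for d, s in zip(df, variances)) / total_df)
ratio = t_g / t_a  # the statistic B = T_G / T_A of the example

# Approximate critical value for unequal sample sizes (assumed weights n_i / N).
total_n = sum(sizes)
critical = sum(n * c for n, c in zip(sizes, equal_n_critical)) / total_n

# Small values of the ratio indicate heterogeneous variances.
reject_homogeneity = ratio < critical
```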


Hartley’s test 

Hartley (1950) developed a test for the hypothesis (2.21.12) when the sample
sizes are all equal; that is, $n_i = n$, i = 1, 2, ..., a. The test represents a natural
extension to the F test for the case with a = 2. If the $S_i^2$’s denote the sample
variances, then the test statistic is defined by

$$H = \frac{\max(S_i^2)}{\min(S_i^2)}, \tag{2.21.20}$$

where $\max(S_i^2)$ and $\min(S_i^2)$ denote the largest and smallest sample variances,
respectively. Naturally, when the population variances are all equal, the value
of H would be expected near 1, and the greater the variation among the $S_i^2$’s, the
larger the value of H. The decision rule consists of rejecting the null hypothesis
(2.21.12) if the calculated value of H exceeds $H[a, \nu; 1 - \alpha]$, the $100(1 - \alpha)$th
percentage point of the distribution of H.


Remark: The distribution of the statistic (2.21.20) depends on a and n and initial
tables for 1 and 5 percent critical values were originally given by Hartley (1950).
Later, David (1952) gave tables for $\alpha$ = 0.05, 0.01, a = 2(1)12, and $\nu$ = n − 1 =
2(1)10, 12, 15, 20, 30, 60, $\infty$. These tables are also given in Owen (1962) and Pearson
and Hartley (1970). Some selected percentage points of the H distribution are given in
Appendix Table XXII.


Example 6. To illustrate the procedure, we consider the data given in
Table 2.3. The sample variances are as follows:

$$S_1^2 = 406.8, \quad S_2^2 = 97.6, \quad S_3^2 = 80.8, \quad \text{and} \quad S_4^2 = 69.2.$$

Now, we have

$$\max(S_i^2) = 406.8, \qquad \min(S_i^2) = 69.2,$$

and the statistic (2.21.20) is given by

$$H = \frac{406.8}{69.2} = 5.88.$$

From Appendix Table XXII, we have H[4, 5; 0.95] = 13.70 and so we do
not reject the null hypothesis that the four variances are all equal.


Cochran’s test 

Cochran (1941) developed a test for homoscedasticity especially designed for
the case when one variance is very much larger than the others and the sample
sizes are all equal. The test statistic is given by

$$C = \frac{\max(S_i^2)}{\sum_{i=1}^{a} S_i^2}. \tag{2.21.21}$$

The decision rule consists of rejecting the null hypothesis (2.21.12) if the
calculated value of C exceeds $C[a, \nu; 1 - \alpha]$, the $100(1 - \alpha)$th percentage point
of the distribution of C.


Remark: The distribution of the statistic (2.21.21) depends on a and n, and initial tables
for the upper 5 percentage points for a = 3(1)10 and $\nu$ = n − 1 = 1(1)6(2)10 were given
by Cochran (1941). Later, Eisenhart and Solomon (1947) gave tables for $\alpha$ = 0.05, 0.01,
a = 2(1)12, 15, 20, 24, 30, 40, 60, 120, $\infty$, and $\nu$ = 1(1)10, 16, 36, 144, $\infty$. These
tables are also given in Pearson and Hartley (1973). A more comprehensive tabulation of
the statistic C appears in a publication by Japanese Standards Association (1972). The
latter publication presents the percentage points of C for a = 2(1)20; n = 2(1)31, 41,
61, 121, $\infty$; and $\alpha$ = 0.05, 0.01. Some selected percentage points for the distribution of
C are reprinted in Appendix Table XXIII.


Example 7. To illustrate the procedure, we again consider the data of
Table 2.3 as in the case of the Hartley’s test. The sample variances lead to
the value of the test statistic (2.21.21) given by

$$C = \frac{406.8}{406.8 + 97.6 + 80.8 + 69.2} = 0.6216.$$

From Appendix Table XXIII, we have C[4, 5; 0.95] = 0.5895 and therefore
we reject the null hypothesis (2.21.12) that the variances are all equal.
Note that for the same data Cochran’s test leads to the rejection of the null
hypothesis whereas Hartley’s test fails to reach the critical value.
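
The contrast between Examples 6 and 7 is easy to retrace numerically. The following Python sketch computes both statistics from the four sample variances and compares them with the critical values quoted in the examples; the variable names are hypothetical.

```python
variances = [406.8, 97.6, 80.8, 69.2]  # sample variances quoted in Examples 6 and 7

# Hartley's statistic (2.21.20): largest over smallest sample variance.
hartley_h = max(variances) / min(variances)

# Cochran's statistic (2.21.21): largest sample variance over the sum of all.
cochran_c = max(variances) / sum(variances)

h_critical = 13.70   # H[4, 5; 0.95] as quoted from Appendix Table XXII
c_critical = 0.5895  # C[4, 5; 0.95] as quoted from Appendix Table XXIII

hartley_rejects = hartley_h > h_critical
cochran_rejects = cochran_c > c_critical
```

With these inputs the two decision rules disagree, which is the point made at the end of Example 7.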


Comments on Bartlett’s, Hartley’s and Cochran’s tests 
(i) In most practical situations, the Hartley’s and Cochran’s tests will lead to
similar conclusions. Since Cochran’s test utilizes more information in the sample
data, it is generally more sensitive than Hartley’s test. When the normality
assumption can be relied upon, Bartlett’s test is more powerful than other tests
(Gartside (1972)).

(ii) Both Hartley’s and Cochran’s tests require that all sample sizes be equal.
If the sample sizes are unequal, but do not differ greatly, they may still be 
used as approximate tests. In this case, the value of n would be the average 
sample size for the determination of the percentage points of the test statistics. 
Some statisticians recommend the use of the largest n for this purpose. The 
procedure will result in the probability of type I error being slightly larger than 
the prescribed value. 

(iii) All the test procedures are sensitive to departures from normality (Box
(1953); Box and Anderson (1955)). That is, if the populations from which
samples are taken are not normally distributed, the actual level of significance
may differ greatly from the specified one. They all tend to mask existing differences
in variances if the kurtosis is smaller than zero, or to exhibit nonexistent
differences if the kurtosis is greater than zero. Thus, the values of the test statistics
may lead to an erroneous rejection of the null hypothesis. Therefore, it is not
advisable to test for homoscedasticity unless there is sufficient evidence to assume
that the distributions are at least approximately normal. It is recommended
that any homoscedasticity test be used only when preceded by a preliminary
test which does not reject normality. However, if the test is performed as a
check before using an analysis of variance procedure, then the rejection of the
null hypothesis indicates that at least one of the two underlying assumptions is
violated. For example, it has been found that Bartlett’s test is a good one for
testing departures from normality.

(iv) As noted in the preceding section, the analysis of variance F test is 
not much affected by the unequal variances as long as the differences in the 
variances are not too large and the sample sizes are nearly equal. Hence, a 
fairly low level of $\alpha$ may be justified in conducting the test for the equality of
variances when the sample sizes are nearly equal. This would be appropriate 
in determining the aptness of the analysis of variance model (2.1.1) since only 
large differences between variances need to be detected. 


Other tests of homoscedasticity 

The preceding tests of homoscedasticity are traditional tests based on normal
theory for testing the null hypothesis of equal variances. However, they all
are very sensitive to the assumption of normality and give too many significant
results for data coming from a long-tailed distribution. In recent years, a
number of tests have appeared in the literature that are less sensitive to nonnormality
in the data and are found to have a good power for a variety of population
distributions. Levene (1960) proposed a test that considers the scores
$z_{ij} = (y_{ij} - \bar{y}_{i.})^2$ as identically distributed normal variates and applies the usual
F test on these scores. A significant difference between means of the transformed
scores is considered as evidence of significant differences in variances
of the groups. Levene (1960) also proposed using F tests based on the scores
$z_{ij} = |y_{ij} - \bar{y}_{i.}|$, $z_{ij} = \log_e |y_{ij} - \bar{y}_{i.}|$, and $z_{ij} = |y_{ij} - \bar{y}_{i.}|^{1/2}$.

Following Levene (1960), a number of other robust procedures have been 
proposed that are essentially based on techniques of applying analysis of vari- 
ance to transformed scores. For example, Brown and Forsythe (1974c) proposed 
using the transformed scores based on the absolute deviations from the median. 
In order to increase power when sample sizes are odd, Ramsey and Brailsford 
(1989) suggested that the median be replaced by the pseudo median equal to the 
midpoint of the scores just above and below the median. A somewhat different 
approach known as the jackknife was proposed by Miller (1968) where the original
scores in each group were replaced by the contribution of that observation
to the group variance.²⁷ O’Brien (1979, 1981) proposed a procedure that is
a blend of Levene’s squared deviation scores and the jackknife. It performs
analysis of variance using

$$z_{ij} = \frac{(n_i - 1.5)\, n_i\, (y_{ij} - \bar{y}_{i.})^2 - 0.5\, S_i^2\, (n_i - 1)}{(n_i - 1)(n_i - 2)},$$

where $\bar{y}_{i.}$ and $S_i^2$ represent mean and variance, respectively, for the i-th factor
level.

In recent years, there have been a number of studies investigating the robustness
of these procedures and they point toward the robustness of the Brown-Forsythe
and O’Brien procedures. More recently, Algina et al. (1995) have
proposed a procedure, called maximum test for scale, in which the test statistic
is the more extreme of the Brown-Forsythe and O’Brien test statistics. Some
limited simulation work for the two-sample case indicates better properties for
type I and type II error rates than either the Brown-Forsythe or O’Brien procedure.
For further discussions and details, the reader is referred to Games et al.
(1972), Hall (1972), Layard (1973), Levy (1978b), Keselman et al. (1979),
Conover et al. (1981), Olejnik and Algina (1987), Micceri (1989), Ramsey
(1994), and Algina et al. (1995).
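
The common thread of these procedures, applying the usual one-way analysis of variance F ratio to transformed scores, can be sketched in Python as follows. The sketch covers absolute deviations from the median (Brown-Forsythe) and O’Brien’s scores; the function names and the data are hypothetical, and the F ratio would still have to be referred to an F table with a − 1 and N − a degrees of freedom.

```python
from statistics import mean, median, variance

def one_way_f(groups):
    """Return the one-way ANOVA F ratio and its degrees of freedom."""
    a = len(groups)
    total_n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / total_n
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum((y - mean(g)) ** 2 for y in g) for g in groups)
    df_between, df_within = a - 1, total_n - a
    return (ss_between / df_between) / (ss_within / df_within), df_between, df_within

def brown_forsythe_scores(group):
    """Absolute deviations from the group median."""
    center = median(group)
    return [abs(y - center) for y in group]

def obrien_scores(group):
    """O'Brien's scores; their group mean equals the sample variance (needs n > 2)."""
    n = len(group)
    center = mean(group)
    s2 = variance(group)  # sample variance with divisor n - 1
    return [((n - 1.5) * n * (y - center) ** 2 - 0.5 * s2 * (n - 1)) / ((n - 1) * (n - 2))
            for y in group]

# Hypothetical data: three factor levels, the second visibly more variable.
groups = [[10.1, 9.8, 10.4, 10.0, 9.7],
          [12.5, 8.1, 11.9, 7.4, 10.6],
          [9.9, 10.3, 10.2, 9.6, 10.0]]

f_brown_forsythe, df1, df2 = one_way_f([brown_forsythe_scores(g) for g in groups])
f_obrien, _, _ = one_way_f([obrien_scores(g) for g in groups])
```

A convenient check on the O’Brien scores is that their mean within each group reproduces that group’s sample variance, so the analysis of variance on the scores compares the variances directly.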


2.22 CORRECTIONS FOR DEPARTURES 
FROM ASSUMPTIONS OF THE MODEL 


If the data set in a given problem violates the assumptions of the analysis of 
variance model (2.1.1), a choice of possible corrective measures is available. 
One is to modify the model. However, this approach has the disadvantage that 
more often than not the modified model involves fairly complex analysis. An- 
other approach may be to consider using some nonparametric tests which do not 
make the normal theory assumption for inference problems. A third approach 
to be discussed in this section is to use transformations on the data. Sometimes 
it is possible to make an algebraic transformation of the data to make them 
appear more nearly normally distributed, or to make the variances of the error 
terms constant. Conclusions derived from the statistical analyses performed on 
the transformed data are also applicable to the original data. In this section, we
briefly discuss some commonly used transformations to correct for the lack of
normality and homoscedasticity. Tukey (1955) discussed the use of transformations
such that effects in the transformed scale are additive. Although individual
transformations that will correct for lack of normality, homoscedasticity, and
nonadditivity may be different, Box and Cox (1964, 1982) found that often a
single transformation will simultaneously rectify all the problems.

27 The jackknife procedure computes sample variances within each group by deleting one
observation at a time. Thus, in the i-th group, $n_i$ variances are computed as follows:

$$S_{i(\ell)}^2 = \frac{1}{n_i - 2} \sum_{j \ne \ell} (y_{ij} - \bar{y}_{i(\ell)})^2 \quad \text{where} \quad \bar{y}_{i(\ell)} = \frac{1}{n_i - 1} \sum_{j \ne \ell} y_{ij}.$$

The analysis of variance is performed on the transformed scores $z_{i\ell} = n_i \log_e(S_i^2) - (n_i - 1)
\log_e(S_{i(\ell)}^2)$ $(i = 1, 2, \ldots, a;\ \ell = 1, 2, \ldots, n_i)$ and the test statistic is the usual F statistic with
a − 1 and N − a degrees of freedom.


Remark: For further discussions of transformations, the reader may refer to Bartlett 
(1936, 1947), Cochran (1940), Curtiss (1943), Bartlett and Kendal (1946), Eisenhart 
(1947b), Freeman and Tukey (1950), Tukey (1957), Draper and Hunter (1969), Cox 
(1977), Draper and Smith (1981, pp. 220-221), Efron (1982), Berry (1987), and Hoaglin 
(1988). Natrella (1963, Chapter 20) provides a detailed and thorough discussion of the 
use of transformations. An extremely thorough and detailed monograph on transformation
methodology has been prepared by Thöni (1967). An excellent and thorough
introduction and a bibliography of the topic can be found in a review paper by Hoyle 
(1973). For a more recent bibliography of articles on transformations, see Draper and 
Smith (1981, pp. 683-684). 


TRANSFORMATIONS TO CORRECT LACK OF NORMALITY 


Here, we discuss some transformations to correct for the departures from nor- 
mality. 


Logarithmic transformation 
Suppose the data are distributed according to the relationship

$$y_{ij} = \beta e^{(\alpha_i + e_{ij})}, \tag{2.22.1}$$

where the $e_{ij}$’s are normally and independently distributed, each with mean zero
and variance $\sigma_e^2$. Then, on making a logarithmic transformation of (2.22.1), we
get

$$\log_e(y_{ij}) = \log_e(\beta) + \alpha_i + e_{ij},$$

which can be rewritten as

$$y'_{ij} = \mu + \alpha_i + e_{ij}. \tag{2.22.2}$$

From (2.22.2), we notice that although the $y_{ij}$’s are not normally distributed,
the transformed variables $y'_{ij}$’s are. This may be the case when the distribution
of $y_{ij}$’s is skewed.


Square-root transformation 
Suppose the sample observations are given by the relationship

$$y_{ij} = (\mu + \alpha_i + e_{ij})^2, \tag{2.22.3}$$

where, as in (2.22.1), the $e_{ij}$’s are normally and independently distributed with
mean zero and variance $\sigma_e^2$. Then, on making the square-root transformation,
we get

$$y'_{ij} = \sqrt{y_{ij}} = \mu + \alpha_i + e_{ij}. \tag{2.22.4}$$

From (2.22.4), we notice that although the $y_{ij}$’s are not normally distributed,
the transformed variables $y'_{ij}$’s are. This may be the case when the $y_{ij}$’s are
nonnegative real numbers and their distribution is skewed to the right.


Arcsine transformation 
Suppose the sample scores $y_{ij}$’s are binomial proportions with mean p based
on samples of size n. Then, the transformed scores²⁸

$$y'_{ij} = 2 \arcsin \sqrt{y_{ij}} \tag{2.22.5}$$

are approximately normally distributed with approximate mean $\mu' = 2 \arcsin
\sqrt{p}$ and variance 1/n. The transformation (2.22.5) does not perform as well at
the extreme ends of the possible values (counts $n y_{ij}$ near 0 and n). Anscombe (1948) and
Freeman and Tukey (1950) proposed some improved arcsine transformations
given by

$$y'_{ij} = \arcsin \sqrt{(n y_{ij} + 3/8)/(n + 3/4)}$$

and

$$y'_{ij} = \frac{1}{2} \left[ \arcsin \sqrt{\frac{n y_{ij}}{n + 1}} + \arcsin \sqrt{\frac{n y_{ij} + 1}{n + 1}} \right].$$


TRANSFORMATIONS TO CORRECT LACK OF HOMOSCEDASTICITY


There are several types of data in which the variances of the error terms are not 
constant. If there is evidence of some systematic relationship between treatment 
mean and variance, homogeneity of the error variance may be achieved through 
an appropriate transformation of the data. Bartlett (1936) has given a formula 
for deriving such transformations provided the relationship between $\mu_i$ and $\sigma_i^2$
is known. In many cases where the nature of the relationship is not clear, the 
experimenter can, through trial and error, find a transformation that will stabi- 
lize the variance. We now consider some commonly employed transformations 
to stabilize the variance. 


28 If the data come from a population having the so-called negative binomial distribution, then 
the use of inverse hyperbolic sines may be more appropriate (Beall (1942); Bartlett (1947); 
Anscombe (1948)). 




Logarithmic transformation 

This transformation is applicable when $\sigma_i^2 \propto \mu_i^2$ or $\sigma_i \propto \mu_i$, that is, when the
factor level standard deviation is proportional to the corresponding mean. This
type of situation arises when the distribution of scores is markedly skewed. The
transformation is also applicable when the scores are standard deviations. In
this case $s_i / \bar{y}_{i.}$ tends to be constant and so a logarithmic transformation, that is,

$$ y'_{ij} = \log_e(y_{ij}), \tag{2.22.6} $$

would stabilize the variance. If some of the measurements are small (particularly
zero), the recommended transformation is (Bartlett 1947)

$$ y'_{ij} = \log_e(y_{ij} + 1). \tag{2.22.7} $$


Square-root transformation 

This transformation is applicable when $\sigma_i^2 \propto \mu_i$, that is, when the means and
variances are proportional for each factor level. This type of situation is often
found when the observed variable $y_{ij}$ is a count, such as the number of auto
accidents in a given year. In this case, the sample statistic $s_i^2 / \bar{y}_{i.}$ tends to be
constant and so a square-root transformation such as

$$ y'_{ij} = \sqrt{y_{ij}} \tag{2.22.8} $$

would stabilize the variance. If some of the observations $y_{ij}$'s are very small
(particularly zero), homogeneity of variance is more likely to be achieved by
the transformation [footnote 29] (Bartlett (1936))

$$ y'_{ij} = \sqrt{y_{ij} + 0.5}. \tag{2.22.9} $$


The square-root transformation is usually applied to all data assumed to fol- 
low a Poisson distribution. For a discussion of the use of square-root trans- 
formation to perform analysis of variance for Poisson data, see Budescu and 
Applebaum (1981). 
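
The variance-stabilizing effect of the square-root transformation on Poisson counts can be checked numerically. The following is a minimal sketch, not part of the original text: the helper names `poisson_pmf` and `variance_under` and the chosen means (10, 40, 160) are illustrative assumptions. It sums over the Poisson probability function to compare the variance of $y$, which equals the mean, with the variance of $\sqrt{y}$, which stays close to 1/4 whatever the mean.

```python
import math

def poisson_pmf(k, mu):
    """Poisson probability P(Y = k), computed on the log scale for stability."""
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))

def variance_under(transform, mu):
    """Variance of transform(Y) for Y ~ Poisson(mu), by summing over the pmf."""
    upper = int(mu + 20 * math.sqrt(mu) + 50)  # the tail beyond this is negligible
    probs = [poisson_pmf(k, mu) for k in range(upper + 1)]
    mean = sum(transform(k) * p for k, p in enumerate(probs))
    return sum((transform(k) - mean) ** 2 * p for k, p in enumerate(probs))

# Hypothetical factor-level means chosen only for illustration.
for mu in (10, 40, 160):
    raw = variance_under(float, mu)        # variance of y itself
    rooted = variance_under(math.sqrt, mu) # variance of sqrt(y)
    print(f"mean={mu:4d}  var(y)={raw:8.3f}  var(sqrt(y))={rooted:.3f}")
```

The raw variances grow in proportion to the mean, while the variances on the square-root scale remain roughly constant, which is the sense in which (2.22.8) stabilizes the variance.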


Reciprocal transformation 

This type of transformation is applicable when $\sigma_i \propto \mu_i^2$, that is, when the factor
level standard deviation is proportional to the square of the corresponding mean.
In this case $s_i / \bar{y}_{i.}^2$ tends to be constant and an appropriate transformation to
stabilize the variance is the reciprocal transformation [footnote 30]; that is,

$$ y'_{ij} = 1/y_{ij}. \tag{2.22.10} $$


29 The transformation $y'_{ij} = \sqrt{y_{ij} + 3/8}$ has an even better variance stabilizing property than
equation (2.22.9) (Anscombe (1948); Kihlberg et al. (1972)). Freeman and Tukey (1950) showed
that the transformation $y'_{ij} = \sqrt{y_{ij}} + \sqrt{y_{ij} + 1}$ will yield results similar to (2.22.9) but is
preferable for $y_{ij} < 2$.


The transformation (2.22.10) is generally used when $y'_{ij} = 1/y_{ij}$ has a definite
physical meaning and where the possibility of the random variable being less
than or equal to zero is negligible. For example, data on the failures of a machine
may be collected as either “intervals between failures,” or “the number of fail-
ures per unit time.” Similarly, the reciprocal of the survival time is related to the
death rate and the reciprocal of the waiting time until some phenomenon
occurs is related to the speed with which the phenomenon occurs.


Arcsine transformation 

This transformation is applicable when $\sigma_i^2 \propto \mu_i(1 - \mu_i)$, that is, when scores
are proportions. For example, the factor levels may be different treatment proce-
dures, the unit of observation is a clinical center, and the observed variable $y_{ij}$ is
the proportion of patients in the i-th treatment group for the j-th clinical center
who benefited by the treatment. In this case an appropriate transformation to
stabilize the variance is the arcsine transformation; that is,

$$ y'_{ij} = \arcsin \sqrt{y_{ij}}. \tag{2.22.11} $$


The transformed score using (2.22.11) is the angle whose sine is equal to the 
square root of the original score. Tables to facilitate this transformation have 
been prepared (see, e.g., Fisher and Yates (1963); Owen (1962)). 
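
As an illustration of why the arcsine transformation stabilizes the variance of proportions, the sketch below compares the variance of a binomial proportion with the variance of its arcsine square root. It is only a sketch under stated assumptions: the sample size `n = 50`, the probabilities 0.3, 0.5, and 0.7, and the helper names are hypothetical choices, and the benchmark value 1/(4n) is the usual large-sample approximation rather than an exact result.

```python
import math

def binomial_pmf(k, n, p):
    """Binomial probability P(Y = k) for n trials with success probability p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def variance_of_proportion(transform, n, p):
    """Variance of transform(Y / n) for Y ~ Binomial(n, p), summed over the pmf."""
    probs = [binomial_pmf(k, n, p) for k in range(n + 1)]
    mean = sum(transform(k / n) * q for k, q in enumerate(probs))
    return sum((transform(k / n) - mean) ** 2 * q for k, q in enumerate(probs))

def arcsine(y):
    """Transformation (2.22.11): the angle whose sine is the square root of y."""
    return math.asin(math.sqrt(y))

n = 50  # hypothetical number of patients per clinical center
for p in (0.3, 0.5, 0.7):
    raw = variance_of_proportion(lambda y: y, n, p)
    stabilized = variance_of_proportion(arcsine, n, p)
    print(f"p={p}  var(y)={raw:.5f}  var(arcsin sqrt y)={stabilized:.5f}  1/(4n)={1 / (4 * n):.5f}")
```

The variance of the raw proportion changes with p, whereas the variance on the arcsine scale stays near 1/(4n) for each p considered.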


Square transformation 
If the standard deviation decreases as the corresponding factor level mean in- 
creases, then the transformation 


$$ y'_{ij} = y_{ij}^2 \tag{2.22.12} $$


would stabilize the variance. The transformation (2.22.12) is generally useful
when the distribution is skewed to the left. 


Power transformation 

When there does not exist a theoretical basis to select a transformation, or the 
transformations described fail to achieve normality or homoscedasticity, a class 
of transformations proposed by Box and Cox (1964) can be used to achieve the 
desired objective. The general form of the transformation is given by 


$$ f(y_{ij}) = \begin{cases} y_{ij}^{\lambda}, & \lambda \neq 0, \\ \log_e(y_{ij}), & \lambda = 0, \end{cases} \tag{2.22.13} $$

where $\lambda$ is a parameter to be determined from the data. The analyst tries different
values of $\lambda$ in (2.22.13) until the transformed scores conform to the assumption
in question. It should be noted that the transformation (2.22.13) includes the
following simple transformations as special cases.

$\lambda = -1$,    $f(y_{ij}) = 1/y_{ij}$
$\lambda = -0.5$,  $f(y_{ij}) = 1/\sqrt{y_{ij}}$
$\lambda = 0$,     $f(y_{ij}) = \log_e(y_{ij})$ (by definition)
$\lambda = 0.5$,   $f(y_{ij}) = \sqrt{y_{ij}}$
$\lambda = 2$,     $f(y_{ij}) = y_{ij}^2$


30 If $y_{ij}$ represents counts, then $y'_{ij} = 1/(y_{ij} + 1)$ may be used to avoid a possibility of division
by zero.


These are some of the more commonly used transformations. Still other 
transformations can be found to be applicable for various other relationships 
between the means and variances. Furthermore, the transformations to stabilize 
the variance also often make the population distribution nearly normal. How- 
ever, the use of such transformations may often result in different group means. 
It is possible that the means of the original scores are equal but the means of 
the transformed scores are not, and vice versa. Moreover, the means of trans- 
formed scores are often changed in ways that are not intuitively meaningful or 
are difficult to interpret. 
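
The power family (2.22.13) is simple to express in code. The sketch below is illustrative only: the function name `power_transform` and the sample scores are assumptions, and it implements the plain form $y^{\lambda}$ given in (2.22.13) rather than any rescaled variant of the Box and Cox family.

```python
import math

def power_transform(y, lam):
    """Transformation (2.22.13): y**lam for lam != 0, natural log for lam == 0."""
    if y <= 0:
        raise ValueError("this sketch assumes strictly positive scores")
    return math.log(y) if lam == 0 else y ** lam

scores = [1.0, 4.0, 9.0, 16.0]  # hypothetical positive scores
for lam in (-1, -0.5, 0, 0.5, 2):
    transformed = [round(power_transform(y, lam), 4) for y in scores]
    print(f"lambda={lam:5}  {transformed}")
```

In practice the analyst would apply each candidate value of $\lambda$ to the data and then re-examine the residuals or the within-group variances on the transformed scale.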


EXERCISES 


1. In an effort to increase the service life of a handbrake, an automobile 
manufacturing company has developed three new designs. To assess 
their performance against a standard design of a handbrake, 12 auto- 
mobiles of a certain make were randomly chosen and assigned to four
different groups with 3 cars in each group. The handbrakes of four 
different designs were then randomly assigned to each group with 
each of the 3 cars in every group using a handbrake of one of the 
four designs. The relevant data on service life, measured in months, 
for each handbrake are given as follows. 


Standard Design 21.2 13.4 17.0 
New Design-I 21.4 12.0 13.0 
New Design-II 3.2 9.1 4.2 
New Design-III 8.7 35.8 39.0


(a) Describe the model and the assumptions for the experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Perform an appropriate F test to determine if there are signifi-
cant differences in the average service life in the four groups of
handbrakes. Use $\alpha = 0.05$.

(d) Carry out the test for homoscedasticity at $\alpha = 0.01$ by employing
(i) Bartlett’s test,
(ii) Hartley’s test,
(iii) Cochran’s test.

(e) Would you consider using appropriate contrasts? If so, perform
the following, and interpret your results:
(i) Orthogonal contrasts,
(ii) Tukey’s procedure,
(iii) Scheffé’s procedure.

(f) If it is found that the measures of service life for handbrakes have
a distribution skewed to the right, what transformation would be
appropriate to correct it? Make the required transformation on
the data and repeat the analyses carried out in parts (b), (c), and
(d).

(g) Why were all the automobiles included in the experiment of a 
certain preselected model? Is it possible to generalize the results 
of this study to automobiles of any other model? 


2. A study was carried out to determine if different types of savings in-
stitutions attract similar amounts of savings after adjusting for factors
such as advertising, years in operation, and size of the neighborhoods 
of the branches, and so on. A research analyst randomly selected 5 
out of a large number of savings institutions included in the study 
and from each of these 5 institutions, 5 branches were selected at 
random. The total savings, in millions of dollars, in the 25 branches 
included in the study are given as follows. 


Types of Savings Institutions
A      B      C      D      E

37.2   33.4   37.5   31.0   30.9
38.4   37.7   36.6   33.4   37.0
36.0   38.8   35.8   36.7   36.2
31.3   32.8   37.0   39.0   38.1
32.4   33.7   35.6   37.1   36.8


(a) Describe the mathematical model and the assumptions involved. 
Would you use Model I or Model II? Explain. 

(b) Analyze the data and report the analysis of variance table. 

(c) Determine if there is significant evidence to conclude that the av- 
erage accumulated savings are not the same among the different 
types of savings institutions under study. Use $\alpha = 0.01$.

(d) Would you consider using contrasts? Explain. 

3. Calculations of the sums of squares for a one-way analysis of variance
from certain experimental data yielded the following results:

$SS_B = 570.23$,
$SS_W = -19.72$,
and
$SS_T = 550.51$.


What can you say about the correctness of the results? Explain. 

4. A consumer organization carried out a study to determine whether 
the price being offered for a used car differed with the personality 
of the owner of the car. Four individuals were selected for the study, 
and each, pretending to be the owner, was sent to 5 different dealers. 
From each of the 20 dealers selected in the study the price quotes 
were obtained on a five-year old medium price car. The amounts 
offered, in hundreds of dollars, by each of the 20 dealers in the study 
are given as follows. 


Owners 
A B C D 


40 40 34 35 
38 43 37 40 
40 41 38 37 
41 42 40 36 
37 43 35 34


(a) Describe the mathematical model and the assumptions involved. 
Would you consider using a Model I or Model II? Explain. 

(b) Analyze the data and report the analysis of variance table. 

(c) Perform an appropriate F test to determine whether the price 
quotes differ according to the personality traits of the owner. 
Use $\alpha = 0.05$.

(d) What are the variance components associated with the assump- 
tions of the Model II? 

(e) Obtain appropriate point and interval estimates of the variance 
components identified in part (d). 

5. Out of three different textbooks published by three leading publish- 
ers, a Statistics professor is trying to choose one for adoption for his 
basic statistics class. He designed an experiment with 30 students of 
his class, whom he randomly assigned into three different groups, 
placing 10 in each group. The three textbooks, from John Wiley, 
Prentice-Hall, and Wadsworth, were then randomly assigned to each 
group. After the end of the course, all the students who completed 
the course took the same examination. The scores of the examination 
are given in the following. 




Textbooks 


John Wiley Prentice-Hall Wadsworth 


80 55 65 
80 62 55 
81 80 62
71 70 67
81 70 58
75 66 72
82 77 70 
78 75 60 
36 52 
84 


(a) Describe the mathematical model and the assumptions involved. 

(b) Analyze the data and report the analysis of variance table. 

(c) Perform an appropriate F test for the hypothesis that the average
scores using three different textbooks are the same. Use $\alpha = 0.05$.

(d) Carry out the test for homoscedasticity at $\alpha = 0.01$ by employing
(i) Bartlett’s test,
(ii) Hartley’s test,
(iii) Cochran’s test.


6. An automobile company wants to know the length of time during
which the premiums are given by the different agents employed by the
company. A study was conducted in which four agents were chosen 
at random and the number of transactions completed by each agent 
in a given week were recorded. The delay, in days, for completing 
the transaction was noted for each sample case and the relevant data 
are given as follows. 


Agents

I II III IV
9 19 21 22 
7 17 30 28 
9 22 32 23 
13 23 26 19 
10 28 29 20 
19 21 30 19 
17 21 21 24 
12 27 28 26 
6 25 33 27 
11 16 20 

10 35 


13 28 




(a) Describe the mathematical model and the assumptions involved. 
Would you consider using Model I or Model II? Explain. 

(b) Analyze the data and report the analysis of variance table. 

(c) Perform an appropriate F test to determine if the mean delay 
time varies from agent to agent. Use $\alpha = 0.01$.

(d) What are the variance components associated with the assump- 
tions of the Model II? 

(e) Obtain appropriate point and interval estimates of the variance 
components identified in part (d). 

7. Consider an experiment designed to investigate differences in 
blood counts in three groups of monkeys randomly administered 
to two drugs and a control. The data on blood counts are given as 
follows. 


Type of Drug 

A B Control 
11.8 14.8 9.4 
10.9 11.7 10.5 
9.7 14.2 9.2
11.4 11.2 10.2 
12.6 11.8 
10.3 


(a) Describe the mathematical model and the assumptions for the 
experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Perform an appropriate F test to determine if the average blood
count varies for three types of drugs. Use $\alpha = 0.05$.

(d) Carry out the test for homoscedasticity at $\alpha = 0.05$ by employing
(i) Bartlett’s test,
(ii) Hartley’s test,
(iii) Cochran’s test.


(e) Determine 95% one-sided and two-sided confidence intervals 
for the single contrast comparing control with the mean of the 
other two drugs using the Dunnett’s statistic and interpret your 
results. 

8. A zoologist studying the structural traits of certain species of mam- 
mals classifies them into three groups: small, medium, or large ac- 
cording to the size of the vertebrate. He selects three random samples 
of size 8 from each group and then records the length of each in the 
sample. The relevant data on length measurements in certain standard 
units are given as follows.


Mammal Groups 


Small   Medium   Large

8.1     11.4     8.6
8.8     11.2     7.1
10.5    10.6     7.4
[illegible]  7.7  9.0
9.6     9.5      8.6
9.8     8.1      9.1
10.1    9.5      10.3
12      12.1     9.5


(a) Describe the mathematical model and the assumptions involved. 

(b) Analyze the data and report the appropriate analysis of vari-
ance table.

(c) Perform an appropriate F test for the hypothesis that the mean
length of each group is the same. Use $\alpha = 0.01$.

(d) Carry out the test for homoscedasticity at $\alpha = 0.01$ by employing
(i) Bartlett’s test,
(ii) Hartley’s test,
(iii) Cochran’s test.

(e) Would you consider using contrasts? If so, perform the following
and interpret your results:
(i) Orthogonal contrasts,
(ii) Tukey’s procedure,
(iii) Scheffé’s procedure.


9. Consider the null hypothesis $H_0: \alpha_1 = \alpha_2 = \alpha_3 = \alpha_4 = 0$ versus
the alternative $H_1$: not all $\alpha_i$'s are zero.

(a) Determine three orthogonal contrasts. 

(b) Are the three orthogonal contrasts given in part (a) unique; that is, 
can you construct two or more separate sets of three orthogonal 
contrasts? 

(c) Can you construct four orthogonal contrasts? 

10. A manufacturing company employs a large number of presses that 
are used to produce certain automobile parts. A study was conducted 
to assess the performance of the presses. A sample of four presses 
was selected at random from the entire plant and then 10 parts were 
taken at random from the production line of each press. The measures 
on the length of the 40 parts were determined and the calculations 
on the sums of squares yielded the following results: 


SSz = 0.0264 and SS, = 0.0380 


(a) Describe the mathematical model and the assumptions involved. 


(b) Prepare the pertinent analysis of variance table.

(c) Perform an appropriate F test to determine if the presses are
similar in their average performance. Use $\alpha = 0.05$.

(d) Obtain appropriate point and interval estimates of the variance
components associated with the model assumed in part (a).

(e) Test the hypothesis that the between and within component ratio
is equal to or less than 1/2. Use $\alpha = 0.05$.

(f) Find an interval estimate for the between and within compo-
nent ratio and the intraclass correlation using the confidence
coefficient of 0.95.

11. A study was performed to determine the effect of different varieties of
fertilizers upon potato yields. Three fertilizers, designated by N, P,
and K, were used. Each fertilizer was randomly assigned to 10 plots
and the yields were determined for each of the 30 plots. The yield
totals corresponding to the three fertilizer groups are

$Y_N = 50, \quad Y_P = 70, \quad Y_K = 100;$


and the total sum of squares is calculated to be 580. 

(a) Describe the mathematical model and the assumptions involved. 

(b) Prepare the pertinent analysis of variance table. 

(c) Perform an appropriate F test for the hypothesis that the average
yield for each fertilizer group is the same. Use $\alpha = 0.01$.

(d) Consider the following hypothesis of interest: the average for
N equals the average of the other two fertilizers. Carry out the
corresponding test of the hypothesis and state your conclusions.
Use $\alpha = 0.01$.

12. In a study involving a health and nutrition survey, 15 families, each
spending a comparable amount on their grocery bills, were administered
a survey questionnaire regarding their dietary habits. The families
were classified according to whether they lived in a rural, urban, or
suburban district and the data on average daily protein consumption
are given as follows.


District 


Urban Suburban Rural 


371 365 491 
334 352 421 
358 362 441
300 321 461 
343 342 

302 


(a) Describe the mathematical model and the assumptions involved. 
(b) Analyze the data and report the analysis of variance table. 


(c) Perform an appropriate F test to determine if the average daily
protein consumptions are equal for the three districts. Use $\alpha = 0.05$.

(d) Carry out the test for homoscedasticity at $\alpha = 0.05$ by employing
(i) Bartlett’s test,
(ii) Hartley’s test,
(iii) Cochran’s test.

(e) Would you consider using contrasts? If so, perform the following
and interpret your results:
(i) Orthogonal contrasts,
(ii) Tukey’s procedure,
(iii) Scheffé’s procedure.

13. A study was conducted to study the relationship between intelligence
and ability to concentrate. Thirty students were randomly selected
from a large psychology class and were administered tests of intel-
ligence and concentration ability. The students were classified into
five groups according to their concentration ability and the data on
IQ score are given as follows.


Concentration Ability 


I II III IV V


121 115 130 74 96 
129 132 118 105 94 
140 114 132 106 88 

118 106 104 

123 116 97 

111 99 103 

121 108 

113 96 


111 
121 


(a) Describe the mathematical model and the assumptions involved. 

(b) Analyze the data and report the analysis of variance table. 

(c) Perform an appropriate F test to determine if the average IQ
scores are equal for the five groups. Use $\alpha = 0.05$.

(d) Carry out the test for homoscedasticity at $\alpha = 0.05$ by employing
(i) Bartlett’s test,
(ii) Hartley’s test,
(iii) Cochran’s test.

(e) Would you consider using contrasts? If so, perform the following
and interpret your results:
(i) Orthogonal contrasts,
(ii) Tukey’s procedure,
(iii) Scheffé’s procedure.


14. Hendy and Charles (1970) reported data on the silver content (% Ag)
of a number of Byzantine coins discovered in Cyprus. There were
nine coins from the first coinage of the reign of King Manuel I,
Comnenus (1143-1180); seven of the coins came from the second
coinage minted several years later and four from the third coinage
(still later); another seven were from a fourth coinage. The question
of interest is whether there were significant differences in the silver
content of coins minted early and late in King Manuel’s reign. The
data are given as follows.


First Coinage   Second Coinage   Third Coinage   Fourth Coinage

5.9             6.9              4.9             5.3
6.8             9.0              5.5             5.6
6.4             6.6              4.6             5.5
7.0             8.1              4.5             5.1
6.6             9.3                              6.2
7.7             9.2                              5.8
[illegible]     8.6                              5.8
6.9
6.2


Source: Hendy and Charles (1970). Used with permission.


(a) Describe the mathematical model and the assumptions involved.
(b) Analyze the data and report the analysis of variance table.
(c) Perform an appropriate F test to determine if the average silver
contents are equal for the four coinages. Use $\alpha = 0.05$.
(d) Carry out the test for homoscedasticity at $\alpha = 0.05$ by employing
(i) Bartlett’s test,
(ii) Hartley’s test,
(iii) Cochran’s test.

(e) Would you consider using contrasts? If so, perform the following
and interpret your results:
(i) Orthogonal contrasts,
(ii) Tukey’s procedure,
(iii) Scheffé’s procedure.


15. Anionwu et al. (1981) reported data on steady-state haemoglobin
levels for patients with different types of sickle cell disease. The
question of interest is whether the steady-state haemoglobin levels
differ significantly between patients with different types. The data
are given as follows.


Type of Sickle Cell Disease


HB SS HB S/-thalassaemia HB SC


7.2 8.1 10.7
[illegible] 9.2 [illegible]
8.0 10.0 11.5
8.1 10.4 11.6
8.3 10.6 11.7
8.4 10.9 11.8
8.4 11.1 12.0
8.5 11.9 12.1
8.6 12.0 12.3
8.7 12.1 12.6
9.1 12.6
9.1 13.3
9.1 13.3
9.8 13.8
10.1 13.9
10.3


Source: Anionwu et al. (1981). Used with permission.


(a) Describe the mathematical model and the assumptions involved.

(b) Analyze the data and report the analysis of variance table.

(c) Perform an appropriate F test to determine if the average levels
of haemoglobin are equal for the three types of sickle cell disease.
Use $\alpha = 0.05$.

(d) Carry out the test for homoscedasticity at $\alpha = 0.05$ by employing
(i) Bartlett’s test,
(ii) Hartley’s test,
(iii) Cochran’s test.

(e) Would you consider using contrasts? If so, perform the following
and interpret your results:
(i) Orthogonal contrasts,
(ii) Tukey’s procedure,
(iii) Scheffé’s procedure.

16. Sokal and Rohlf (1994, p. 237) reported data on the number of eggs
laid per female per day for the first 14 days of life (per diem fecundity)
for 25 females of each of three genetic lines of the fruitfly Drosophila
melanogaster. The genetic lines labelled RS and SS were selec-
tively bred for resistance and susceptibility to DDT, respectively,
and the line NS is a nonselected control strain. The purpose of the
study was to investigate whether the two selected lines (RS and SS)
differ in fecundity from the nonselected line; and whether the RS line
differs in fecundity from the SS line. The data are given as follows.


Genetic Lines 


Resistant (RS) Susceptible (SS) Nonselected (NS) 
12.8 38.4 35.4 
21.6 32.9 27.4 
14.8 48.5 19.3 
23.1 20.9 41.8 
34.6 11.6 20.3 
19.7 22.3 37.6 
22.6 30.2 36.9 
29.6 33.4 37.3 
16.4 26.7 28.2 
20.3 39.0 23.4 
29.3 12.8 33.7
14.9 14.6 29.2 
27.3 12.2 41.7 
22.4 23.1 22.6 
27.5 29.4 40.4 
20.3 16.0 34.4 
38.7 20.1 30.4 
26.4 23.3 14.9 
23.4 22.9 51.8
26.1 22.9 33.8 
29.5 15.1 37.9 
38.6 31.0 29.5 
44.4 16.9 42.4 
23.2 16.1 36.6 
23.6 10.8 47.4 


Source: Sokal and Rohlf (1994, p. 237). Used with permission. 


(a) Describe the mathematical model and the assumptions involved. 

(b) Analyze the data and report the analysis of variance table. 

(c) Perform an appropriate F test to determine if the average fecun-
dity levels are equal for the three genetic lines. Use $\alpha = 0.05$.

(d) Carry out the test for homoscedasticity at $\alpha = 0.05$ by employing
(i) Bartlett’s test,
(ii) Hartley’s test,
(iii) Cochran’s test.

(e) Would you consider using contrasts? If so, test the follow-
ing contrasts and interpret your results: (i) RS + SS − 2NS,
(ii) RS − SS.


Two-Way Crossed 
Classification Without 
Interaction 


3.0 PREVIEW 


The major advantage of the one-way classification (one-factor design) discussed 
in the preceding chapter is its simplicity, which extends to the experimental 
layout, the model and assumptions underlying the analysis of variance, and the 
computations involved in the analysis. The major disadvantage of such a design 
is its relative inefficiency. The error variance will usually be large compared to 
that resulting from other designs. This is in part offset by the fact that no other 
design yields as many degrees of freedom for the error variance as does this 
design. 

In many investigations, however, it is desirable to measure response at com- 
binations of levels of two or more factors considered simultaneously. For ex- 
ample, we might desire to investigate blood pressure for different gender and 
ethnic groups, or to investigate weight loss comparing four diets among urban, 
suburban, and rural subjects according to their gender, or to investigate miles 
per gallon among five makes of automobiles for both city and country driv- 
ing. In investigations involving many factors, the effect of each factor on the 
response variable may be analyzed using one-way classification. Such an anal- 
ysis, however, will not be economical or efficient with respect to time, effort, 
and money. Moreover, such a procedure would give no information about the 
possible interactions that may exist among different factors. 

The theory of analysis of variance permits the investigation of several fac-
tors or independent variables within the same experiment. Such a procedure is
efficient, time saving, and, equally important, it permits the investigation of the 
joint effects of several factors or interactions between them. This and the next 
chapter deal with the statistical model and analysis of variance involving two 
factors such that every level of one factor included in the experiment occurs 
with every level of the second factor and vice versa. Such a layout is termed 
two-way crossed classification. Two-way crossed layout allows a researcher to 
examine fully the main effects of both factors and their interactions. The term 
crossed classification or classification comes from the fact that in many fields of
investigation, the measurements or observations can be classified in the form of
a two-way table where the rows of the table correspond to the levels of a factor
and the columns to the levels of another factor. In a crossed classification it is
customary to refer to the combinations of levels of the factors as cells rather
than treatments.


TABLE 3.1
Data for a Two-Way Experimental Layout

                            Factor B
Factor A    B_1     B_2     ...     B_j     ...     B_b
A_1         y_11    y_12    ...     y_1j    ...     y_1b
A_2         y_21    y_22    ...     y_2j    ...     y_2b
...         ...     ...             ...             ...
A_i         y_i1    y_i2    ...     y_ij    ...     y_ib
...         ...     ...             ...             ...
A_a         y_a1    y_a2    ...     y_aj    ...     y_ab


3.1 MATHEMATICAL MODEL 


Two factors are said to be crossed if the data contain observations at each 
combination of a level of one factor with a level of the other factor. Consider 
two factors A and B having a and b levels, respectively, and let there be exactly 
one observation in each of the $a \times b$ cells of the two-way layout. Let $y_{ij}$ be the
observed score corresponding to the i-th level of factor A and the j-th level of
factor B. The data involving the total of $N = a \times b$ scores $y_{ij}$'s can then be
schematically represented as in Table 3.1.

The analysis of variance model for this type of experimental layout is given
as

$$ y_{ij} = \mu + \alpha_i + \beta_j + e_{ij} \quad (i = 1, 2, \ldots, a; \; j = 1, 2, \ldots, b), \tag{3.1.1} $$

where $-\infty < \mu < \infty$ is the overall mean, $\alpha_i$ is the effect due to the i-th level
of the factor A, $\beta_j$ is the effect due to the j-th level of the factor B, and $e_{ij}$ is
the error term that takes into account the random variation within a particular
cell. The model (3.1.1) states that the observed score $y_{ij}$ corresponding to the
(i, j)-th cell consists of the sum of the components: (i) the grand mean $\mu$,
(ii) the effect $\alpha_i$ associated with the i-th level of factor A, (iii) the effect $\beta_j$
associated with the j-th level of factor B, and (iv) an error term $e_{ij}$ which is
strictly peculiar to the (i, j)-th cell.




3.2 ASSUMPTIONS OF THE MODEL 


Similar to the one-way classification model (2.1.1), the following assumptions
are made in order to make inferences about the existence of effects in model
(3.1.1).

(i) The errors $e_{ij}$'s are assumed to be randomly distributed with mean zero
and common variance $\sigma_e^2$.

(ii) The errors associated with any pair of observations are assumed to be
uncorrelated; that is,

$$ E(e_{ij} e_{i'j'}) = \begin{cases} 0, & i \neq i', \; j \neq j', \\ 0, & i \neq i', \; j = j', \\ 0, & i = i', \; j \neq j', \\ \sigma_e^2, & i = i', \; j = j'. \end{cases} \tag{3.2.1} $$

(iii) Under Model I, the effects $\alpha_i$'s and $\beta_j$'s are assumed to be fixed constants
subject to the constraints

$$ \sum_{i=1}^{a} \alpha_i = \sum_{j=1}^{b} \beta_j = 0. $$

This implies that the observations $y_{ij}$'s are distributed with mean $\mu + \alpha_i + \beta_j$
and common variance $\sigma_e^2$.

(iv) Under Model II, the effects $\alpha_i$'s and $\beta_j$'s are assumed to be randomly
distributed with zero means and variances $\sigma_\alpha^2$ and $\sigma_\beta^2$, respectively. Fur-
thermore, the $\alpha_i$'s, $\beta_j$'s, and $e_{ij}$'s are mutually and completely uncorrelated;
that is, in addition to (3.2.1), the following relations hold:

$$ \begin{aligned} E(\alpha_i \alpha_{i'}) &= 0, \quad i \neq i'; \\ E(\beta_j \beta_{j'}) &= 0, \quad j \neq j'; \\ E(\alpha_i \beta_j) &= 0, \quad \text{all } (i, j)\text{'s}; \\ E(\alpha_i e_{ij}) &= 0, \quad \text{all } (i, j)\text{'s}; \\ \text{and} \quad E(\beta_j e_{ij}) &= 0, \quad \text{all } (i, j)\text{'s}. \end{aligned} \tag{3.2.2} $$

Then, from the model equation (3.1.1), we have $\sigma_y^2 = \sigma_e^2 + \sigma_\alpha^2 + \sigma_\beta^2$;
and thus $\sigma_e^2$, $\sigma_\alpha^2$, and $\sigma_\beta^2$ are components of $\sigma_y^2$, the variance of an
observation. This implies that the observations $y_{ij}$'s are distributed with
mean $\mu$ and common variance $\sigma_e^2 + \sigma_\alpha^2 + \sigma_\beta^2$.

(v) Under Model III, the effects $\alpha_i$'s are assumed to be fixed subject to the
constraint $\sum_{i=1}^{a} \alpha_i = 0$ and the effects $\beta_j$'s are assumed to be randomly
distributed with mean zero and variance $\sigma_\beta^2$. Furthermore, as before, the
$\beta_j$'s are uncorrelated with each other and each of the $\beta_j$'s and $e_{ij}$'s are
also uncorrelated; that is,

$$ \begin{aligned} E(\beta_j \beta_{j'}) &= 0, \quad j \neq j'; \\ \text{and} \quad E(\beta_j e_{ij}) &= 0, \quad \text{all } (i, j)\text{'s}. \end{aligned} \tag{3.2.3} $$

In this case, $\sigma_y^2 = \sigma_e^2 + \sigma_\beta^2$ and so $\sigma_e^2$ and $\sigma_\beta^2$ are components of $\sigma_y^2$
and the $y_{ij}$'s are distributed with mean $\mu + \alpha_i$ and common variance
$\sigma_e^2 + \sigma_\beta^2$.


Remark: Under the assumptions of the random effects model in (3.1.1), the obser-
vations within the same level of the factor A have the correlation given by
$\rho_A = \sigma_\alpha^2 / (\sigma_e^2 + \sigma_\alpha^2 + \sigma_\beta^2)$. Similarly, the observations within the same level of factor B have
the correlation given by $\rho_B = \sigma_\beta^2 / (\sigma_e^2 + \sigma_\alpha^2 + \sigma_\beta^2)$. Under the assumptions of the mixed
model in (3.1.1), the observations within the same level of the random factor B have
the correlation given by $\rho_B = \sigma_\beta^2 / (\sigma_e^2 + \sigma_\beta^2)$. These correlations are referred to as the
intraclass correlations.
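
As a small numerical illustration of the intraclass correlations in the Remark, the sketch below evaluates them for one set of variance components. The function names and the values 4, 1, and 3 for $\sigma_e^2$, $\sigma_\alpha^2$, and $\sigma_\beta^2$ are hypothetical and chosen only to keep the arithmetic transparent.

```python
from fractions import Fraction

def random_model_correlations(var_e, var_alpha, var_beta):
    """Intraclass correlations (rho_A, rho_B) under the random effects model."""
    total = var_e + var_alpha + var_beta  # variance of a single observation
    return var_alpha / total, var_beta / total

def mixed_model_correlation(var_e, var_beta):
    """Intraclass correlation rho_B under the mixed model (factor B random)."""
    return var_beta / (var_e + var_beta)

# Hypothetical variance components: sigma_e^2 = 4, sigma_alpha^2 = 1, sigma_beta^2 = 3.
rho_a, rho_b = random_model_correlations(Fraction(4), Fraction(1), Fraction(3))
rho_b_mixed = mixed_model_correlation(Fraction(4), Fraction(3))
print(rho_a, rho_b, rho_b_mixed)
```

Exact fractions are used so that each correlation can be read directly as a ratio of variance components.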


3.3. PARTITION OF THE TOTAL SUM OF SQUARES 


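The total variation or total sum of squares with respect to model (3.1.1) is
$\sum_{i=1}^{a} \sum_{j=1}^{b} (y_{ij} - \bar{y}_{..})^2$, which can be partitioned as follows:

$$ \begin{aligned} \sum_{i=1}^{a} \sum_{j=1}^{b} (y_{ij} - \bar{y}_{..})^2 &= \sum_{i=1}^{a} \sum_{j=1}^{b} \left[ (\bar{y}_{i.} - \bar{y}_{..}) + (\bar{y}_{.j} - \bar{y}_{..}) + (y_{ij} - \bar{y}_{i.} - \bar{y}_{.j} + \bar{y}_{..}) \right]^2 \\ &= \sum_{i=1}^{a} \sum_{j=1}^{b} (\bar{y}_{i.} - \bar{y}_{..})^2 + \sum_{i=1}^{a} \sum_{j=1}^{b} (\bar{y}_{.j} - \bar{y}_{..})^2 + \sum_{i=1}^{a} \sum_{j=1}^{b} (y_{ij} - \bar{y}_{i.} - \bar{y}_{.j} + \bar{y}_{..})^2 \\ &= b \sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})^2 + a \sum_{j=1}^{b} (\bar{y}_{.j} - \bar{y}_{..})^2 + \sum_{i=1}^{a} \sum_{j=1}^{b} (y_{ij} - \bar{y}_{i.} - \bar{y}_{.j} + \bar{y}_{..})^2, \end{aligned} \tag{3.3.1} $$

where

$$ \bar{y}_{i.} = \frac{1}{b} \sum_{j=1}^{b} y_{ij}, \qquad \bar{y}_{.j} = \frac{1}{a} \sum_{i=1}^{a} y_{ij}, $$

and

$$ \bar{y}_{..} = \frac{1}{ab} \sum_{i=1}^{a} \sum_{j=1}^{b} y_{ij}. $$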

The identity (3.3.1) is valid since all the cross-product terms are equal to zero.
The first two sums of squares to the right of (3.3.1) measure the variation
due to the $\alpha_i$'s and $\beta_j$'s, respectively, and the last one corresponds to the error
$e_{ij}$'s. We use the notation $SS_A$, $SS_B$, and $SS_E$ to denote the sums of squares
due to the $\alpha_i$'s, $\beta_j$'s, and $e_{ij}$'s, respectively. The corresponding mean squares,
obtained by dividing $SS_A$, $SS_B$, and $SS_E$ by $(a - 1)$, $(b - 1)$, and $(a - 1)(b - 1)$,
respectively, are denoted by $MS_A$, $MS_B$, and $MS_E$, respectively. Here, $(a - 1)$,
$(b - 1)$, and $(a - 1)(b - 1)$ are obtained by partitioning the total degrees of
freedom $ab - 1$ into three components: due to the $\alpha_i$'s, $\beta_j$'s, and $e_{ij}$'s.


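Remark: The three cross-product terms arising in equation (3.3.1) are:

$$
\begin{aligned}
2\sum_{i=1}^{a}\sum_{j=1}^{b} (\bar{y}_{i.} - \bar{y}_{..})(\bar{y}_{.j} - \bar{y}_{..})
&= 2\sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..}) \sum_{j=1}^{b} (\bar{y}_{.j} - \bar{y}_{..}) \\
&= 2\sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})(b\bar{y}_{..} - b\bar{y}_{..}) \\
&= 0;
\end{aligned}
$$

$$
\begin{aligned}
2\sum_{i=1}^{a}\sum_{j=1}^{b} (\bar{y}_{i.} - \bar{y}_{..})(y_{ij} - \bar{y}_{i.} - \bar{y}_{.j} + \bar{y}_{..})
&= 2\sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..}) \sum_{j=1}^{b} (y_{ij} - \bar{y}_{i.} - \bar{y}_{.j} + \bar{y}_{..}) \\
&= 2\sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})(b\bar{y}_{i.} - b\bar{y}_{i.} - b\bar{y}_{..} + b\bar{y}_{..}) \\
&= 0;
\end{aligned}
$$

and

$$
\begin{aligned}
2\sum_{i=1}^{a}\sum_{j=1}^{b} (\bar{y}_{.j} - \bar{y}_{..})(y_{ij} - \bar{y}_{i.} - \bar{y}_{.j} + \bar{y}_{..})
&= 2\sum_{j=1}^{b} (\bar{y}_{.j} - \bar{y}_{..}) \sum_{i=1}^{a} (y_{ij} - \bar{y}_{i.} - \bar{y}_{.j} + \bar{y}_{..}) \\
&= 2\sum_{j=1}^{b} (\bar{y}_{.j} - \bar{y}_{..})(a\bar{y}_{.j} - a\bar{y}_{..} - a\bar{y}_{.j} + a\bar{y}_{..}) \\
&= 0.
\end{aligned}
$$
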
Thus, all cross-product terms are equal to zero. 
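
The partition (3.3.1) can be checked numerically. The following is a minimal sketch, not part of the original text: the function name `partition_sums_of_squares` and the 3 x 4 data table are made up purely for illustration, and the code simply evaluates each term of (3.3.1) from its definition.

```python
def partition_sums_of_squares(y):
    """Return (SS_T, SS_A, SS_B, SS_E) for an a x b table, one observation per cell."""
    a, b = len(y), len(y[0])
    grand_mean = sum(sum(row) for row in y) / (a * b)
    row_means = [sum(row) / b for row in y]
    col_means = [sum(y[i][j] for i in range(a)) / a for j in range(b)]

    # Left-hand side of (3.3.1): total sum of squares about the grand mean.
    ss_t = sum((y[i][j] - grand_mean) ** 2 for i in range(a) for j in range(b))
    # Right-hand side of (3.3.1): rows (factor A), columns (factor B), error.
    ss_a = b * sum((m - grand_mean) ** 2 for m in row_means)
    ss_b = a * sum((m - grand_mean) ** 2 for m in col_means)
    ss_e = sum((y[i][j] - row_means[i] - col_means[j] + grand_mean) ** 2
               for i in range(a) for j in range(b))
    return ss_t, ss_a, ss_b, ss_e


# Hypothetical 3 x 4 layout, invented for this illustration only.
y = [[12.0, 15.0, 11.0, 14.0],
     [10.0, 13.0, 9.0, 15.0],
     [16.0, 18.0, 13.0, 17.0]]

ss_t, ss_a, ss_b, ss_e = partition_sums_of_squares(y)
print(ss_t, ss_a + ss_b + ss_e)  # the two values agree up to rounding error
```

Because the cross-product terms vanish for any data table, the agreement does not depend on the particular numbers chosen.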




3.4 MEAN SQUARES AND THEIR EXPECTATIONS 


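Next, we examine the expectations of the mean squares. On taking successive
averages of the model equation (3.1.1), we obtain

$$\bar{y}_{i.} = \mu + \alpha_i + \bar{\beta}_{.} + \bar{e}_{i.}, \tag{3.4.1}$$

$$\bar{y}_{.j} = \mu + \bar{\alpha}_{.} + \beta_j + \bar{e}_{.j}, \tag{3.4.2}$$

and

$$\bar{y}_{..} = \mu + \bar{\alpha}_{.} + \bar{\beta}_{.} + \bar{e}_{..}. \tag{3.4.3}$$

Substituting the values of $y_{ij}$, $\bar{y}_{i.}$, $\bar{y}_{.j}$, and $\bar{y}_{..}$ from (3.1.1), (3.4.1), (3.4.2), and
(3.4.3), respectively, into the expressions for $SS_A$, $SS_B$, and $SS_E$ defined in
(3.3.1), we find that

$$SS_E = \sum_{i=1}^{a}\sum_{j=1}^{b} (e_{ij} - \bar{e}_{i.} - \bar{e}_{.j} + \bar{e}_{..})^2, \tag{3.4.4}$$

$$SS_B = a\sum_{j=1}^{b} (\beta_j - \bar{\beta}_{.} + \bar{e}_{.j} - \bar{e}_{..})^2, \tag{3.4.5}$$

and

$$SS_A = b\sum_{i=1}^{a} (\alpha_i - \bar{\alpha}_{.} + \bar{e}_{i.} - \bar{e}_{..})^2. \tag{3.4.6}$$

Now, because the $e_{ij}$'s are uncorrelated and identically distributed with mean
zero and variance $\sigma_e^2$, it follows that

$$E(e_{ij}^2) = \sigma_e^2, \tag{3.4.7}$$

$$E(\bar{e}_{i.}^2) = \sigma_e^2 / b, \tag{3.4.8}$$

$$E(\bar{e}_{.j}^2) = \sigma_e^2 / a, \tag{3.4.9}$$

and

$$E(\bar{e}_{..}^2) = \sigma_e^2 / ab. \tag{3.4.10}$$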


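It is then a matter of straightforward computations to derive the expectations of
mean squares. First, taking the expectation of (3.4.4), we obtain

$$
\begin{aligned}
E(SS_E) &= \sum_{i=1}^{a}\sum_{j=1}^{b} E(e_{ij} - \bar{e}_{i.} - \bar{e}_{.j} + \bar{e}_{..})^2 \\
&= \sum_{i=1}^{a}\sum_{j=1}^{b} \left[E(e_{ij}^2) + E(\bar{e}_{i.}^2) + E(\bar{e}_{.j}^2) + E(\bar{e}_{..}^2)
 - 2E(e_{ij}\bar{e}_{i.}) - 2E(e_{ij}\bar{e}_{.j}) + 2E(e_{ij}\bar{e}_{..}) \right. \\
&\qquad\qquad \left. +\, 2E(\bar{e}_{i.}\bar{e}_{.j}) - 2E(\bar{e}_{i.}\bar{e}_{..}) - 2E(\bar{e}_{.j}\bar{e}_{..})\right] \\
&= ab\left[\sigma_e^2 - \frac{1}{b}\sigma_e^2 - \frac{1}{a}\sigma_e^2 + \frac{1}{ab}\sigma_e^2\right] \\
&= ab\,\sigma_e^2\left[1 - \frac{1}{a} - \frac{1}{b} + \frac{1}{ab}\right] \\
&= (ab - a - b + 1)\,\sigma_e^2 \\
&= (a-1)(b-1)\,\sigma_e^2.
\end{aligned}
\tag{3.4.11}
$$

The expectation of $MS_E$ is, therefore, given by

$$E(MS_E) = E\left[\frac{SS_E}{(a-1)(b-1)}\right] = \sigma_e^2. \tag{3.4.12}$$

Note that the result (3.4.12) is true under the assumptions of fixed and random
as well as mixed effects models.

Now, to derive the expectations of $MS_B$ and $MS_A$, we consider the cases of
Models I, II, and III separately.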


MODEL I (FIXED EFFECTS) 


Under Model I, the $\alpha_i$'s and $\beta_j$'s are fixed quantities depending on the particular
levels included in the experiment with the restriction that $\bar{\alpha}_{.} = \bar{\beta}_{.} = 0$. First,
on taking the expectation of (3.4.5), we obtain

$$E(SS_B) = a\left[\sum_{j=1}^{b} \beta_j^2 + E\sum_{j=1}^{b} (\bar{e}_{.j} - \bar{e}_{..})^2\right] \tag{3.4.13}$$

by virtue of the fact that the $\beta_j$'s are constant and the expectation of the cross-product
term is zero. Now, using the results (3.4.9) and (3.4.10), we find that

$$
\begin{aligned}
E\sum_{j=1}^{b} (\bar{e}_{.j} - \bar{e}_{..})^2 &= \sum_{j=1}^{b} E(\bar{e}_{.j}^2) - bE(\bar{e}_{..}^2) \\
&= \frac{b-1}{a}\,\sigma_e^2.
\end{aligned}
\tag{3.4.14}
$$

Furthermore, on substituting (3.4.14) into (3.4.13), we obtain

$$E(SS_B) = a\sum_{j=1}^{b} \beta_j^2 + (b-1)\,\sigma_e^2. \tag{3.4.15}$$

Therefore, the expectation of $MS_B$ is given by

$$E(MS_B) = E\left(\frac{SS_B}{b-1}\right) = \frac{a}{b-1}\sum_{j=1}^{b} \beta_j^2 + \sigma_e^2. \tag{3.4.16}$$

Similarly, from symmetry, it follows that the expectation of $MS_A$ is given by

$$E(MS_A) = \frac{b}{a-1}\sum_{i=1}^{a} \alpha_i^2 + \sigma_e^2. \tag{3.4.17}$$


MODEL II (RANDOM EFFECTS) 


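Under Model II, the $\alpha_i$'s and $\beta_j$'s are also randomly distributed with mean zero
and variances $\sigma_\alpha^2$ and $\sigma_\beta^2$, respectively. It then follows, using the formulae for
the variances of the sampling distribution of the means of the $\alpha_i$'s and $\beta_j$'s, that

$$E(\alpha_i^2) = \sigma_\alpha^2, \tag{3.4.18}$$

$$E(\bar{\alpha}_{.}^2) = \sigma_\alpha^2 / a, \tag{3.4.19}$$

$$E(\beta_j^2) = \sigma_\beta^2, \tag{3.4.20}$$

and

$$E(\bar{\beta}_{.}^2) = \sigma_\beta^2 / b. \tag{3.4.21}$$

Now, on taking the expectation of (3.4.5), we get

$$E(SS_B) = a\left[E\sum_{j=1}^{b} (\beta_j - \bar{\beta}_{.})^2 + E\sum_{j=1}^{b} (\bar{e}_{.j} - \bar{e}_{..})^2\right] \tag{3.4.22}$$

since the expectation of the cross-product term is zero. Furthermore, using the
results (3.4.20) and (3.4.21), we find that

$$
\begin{aligned}
E\sum_{j=1}^{b} (\beta_j - \bar{\beta}_{.})^2 &= \sum_{j=1}^{b} E(\beta_j^2) - bE(\bar{\beta}_{.}^2) \\
&= b\,\sigma_\beta^2 - \sigma_\beta^2 \\
&= (b-1)\,\sigma_\beta^2.
\end{aligned}
\tag{3.4.23}
$$

On substituting (3.4.14) and (3.4.23) into (3.4.22), we obtain

$$
\begin{aligned}
E(SS_B) &= a\left[(b-1)\,\sigma_\beta^2 + \frac{b-1}{a}\,\sigma_e^2\right] \\
&= a(b-1)\,\sigma_\beta^2 + (b-1)\,\sigma_e^2.
\end{aligned}
\tag{3.4.24}
$$

Therefore, the expectation of $MS_B$ is given by

$$E(MS_B) = E\left(\frac{SS_B}{b-1}\right) = a\,\sigma_\beta^2 + \sigma_e^2. \tag{3.4.25}$$

Similarly, from symmetry, it follows that the expectation of $MS_A$ is given by

$$E(MS_A) = E\left(\frac{SS_A}{a-1}\right) = b\,\sigma_\alpha^2 + \sigma_e^2. \tag{3.4.26}$$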


MODEL III (MIXED EFFECTS)


Under Model III, the $\alpha_i$'s are fixed quantities and the $\beta_j$'s are randomly
distributed with mean zero and variance $\sigma_\beta^2$. Furthermore, the $\beta_j$'s are uncorrelated
with each other and each of the $\beta_j$'s and $e_{ij}$'s are also uncorrelated. Therefore,
using the results (3.4.17) and (3.4.25), it follows that the expectations of $MS_B$
and $MS_A$ are given by

$$E(MS_B) = a\,\sigma_\beta^2 + \sigma_e^2$$

and

$$E(MS_A) = \frac{b}{a-1}\sum_{i=1}^{a} \alpha_i^2 + \sigma_e^2.$$

The foregoing results of Sections 3.3 and 3.4 can now be summarized in a
tabular form as the analysis of variance table shown in Table 3.2.


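TABLE 3.2
Analysis of Variance for Model (3.1.1)

| Source of Variation | Degrees of Freedom | Sum of Squares | Mean Square | Expected Mean Square, Model I | Expected Mean Square, Model II | Expected Mean Square, Model III |
|---|---|---|---|---|---|---|
| Due to A | $a-1$ | $SS_A$ | $MS_A$ | $\sigma_e^2 + \frac{b}{a-1}\sum_{i=1}^{a}\alpha_i^2$ | $\sigma_e^2 + b\,\sigma_\alpha^2$ | $\sigma_e^2 + \frac{b}{a-1}\sum_{i=1}^{a}\alpha_i^2$ |
| Due to B | $b-1$ | $SS_B$ | $MS_B$ | $\sigma_e^2 + \frac{a}{b-1}\sum_{j=1}^{b}\beta_j^2$ | $\sigma_e^2 + a\,\sigma_\beta^2$ | $\sigma_e^2 + a\,\sigma_\beta^2$ |
| Error | $(a-1)(b-1)$ | $SS_E$ | $MS_E$ | $\sigma_e^2$ | $\sigma_e^2$ | $\sigma_e^2$ |
| Total | $ab-1$ | $SS_T$ | | | | |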
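
The Model II column of the table can be illustrated by simulation. The sketch below is an illustration only and not part of the original text: the function names, the parameter values, and the use of normal random effects are all assumptions made for convenience (the expected mean squares themselves do not require normality). It averages the observed mean squares over many simulated layouts and compares them with $\sigma_e^2 + b\sigma_\alpha^2$, $\sigma_e^2 + a\sigma_\beta^2$, and $\sigma_e^2$.

```python
import random


def mean_squares(y):
    """Return (MS_A, MS_B, MS_E) for an a x b table with one observation per cell."""
    a, b = len(y), len(y[0])
    grand = sum(map(sum, y)) / (a * b)
    row_means = [sum(row) / b for row in y]
    col_means = [sum(y[i][j] for i in range(a)) / a for j in range(b)]
    ss_a = b * sum((m - grand) ** 2 for m in row_means)
    ss_b = a * sum((m - grand) ** 2 for m in col_means)
    ss_e = sum((y[i][j] - row_means[i] - col_means[j] + grand) ** 2
               for i in range(a) for j in range(b))
    return ss_a / (a - 1), ss_b / (b - 1), ss_e / ((a - 1) * (b - 1))


def simulate_model_ii(a, b, var_alpha, var_beta, var_e, reps, seed=0):
    """Average the mean squares over `reps` layouts drawn under Model II."""
    rng = random.Random(seed)
    totals = [0.0, 0.0, 0.0]
    for _ in range(reps):
        alpha = [rng.gauss(0.0, var_alpha ** 0.5) for _ in range(a)]
        beta = [rng.gauss(0.0, var_beta ** 0.5) for _ in range(b)]
        # mu = 10.0 is arbitrary; it cancels out of every mean square.
        y = [[10.0 + alpha[i] + beta[j] + rng.gauss(0.0, var_e ** 0.5)
              for j in range(b)] for i in range(a)]
        for k, ms in enumerate(mean_squares(y)):
            totals[k] += ms
    return [t / reps for t in totals]


# Assumed parameter values, chosen only for illustration.
a, b = 4, 5
avg_ms_a, avg_ms_b, avg_ms_e = simulate_model_ii(
    a, b, var_alpha=2.0, var_beta=3.0, var_e=1.0, reps=4000)

print(avg_ms_a, 1.0 + b * 2.0)  # compare with sigma_e^2 + b * sigma_alpha^2
print(avg_ms_b, 1.0 + a * 3.0)  # compare with sigma_e^2 + a * sigma_beta^2
print(avg_ms_e, 1.0)            # compare with sigma_e^2
```

The simulated averages should lie close to the tabulated expectations, with the agreement improving as the number of replications grows.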




3.5 SAMPLING DISTRIBUTION OF MEAN SQUARES 


It is important to recognize that in the derivations of the results on expected mean
squares given in the preceding section, we have not made any distribution assumptions
for $e_{ij}$'s under Model I; for $\alpha_i$'s, $\beta_j$'s, and $e_{ij}$'s under Model II; and for
$\beta_j$'s and $e_{ij}$'s under Model III. However, to derive the form of their sampling
distributions, we require the assumption of normality for the random components
of model (3.1.1). Thus, under Model I, we assume that the $e_{ij}$'s are independent
and normal random variables with mean zero and variance $\sigma_e^2$. Under Model
II, all $\alpha_i$'s, $\beta_j$'s, and $e_{ij}$'s are mutually and completely independent normal
random variables with mean zero and variances $\sigma_\alpha^2$, $\sigma_\beta^2$, and $\sigma_e^2$, respectively.
Finally, under Model III, the $\alpha_i$'s are constants subject to the restriction that
$\sum_{i=1}^{a} \alpha_i = 0$, and the $\beta_j$'s and $e_{ij}$'s are mutually and completely independent
normal random variables with mean zero and variances $\sigma_\beta^2$ and $\sigma_e^2$, respectively.

In the following we give the results on sampling distributions of mean squares
for fixed, random, and mixed effects models. The derivation of these results is
beyond the scope of this volume and can be found in Scheffé (1959), Graybill
(1961), and Searle (1971b).


MODEL I (FIXED EFFECTS)
Under the distribution assumptions of Model I, it can be shown that:

(a) The quantities $MS_E$, $MS_B$, and $MS_A$ are statistically independent.
(b) The following results are true:

(i)
$$\frac{MS_E}{\sigma_e^2} \sim \frac{\chi^2[(a-1)(b-1)]}{(a-1)(b-1)}, \tag{3.5.1}$$
(ii)
$$\frac{MS_B}{\sigma_e^2} \sim \frac{\chi^{2\prime}[b-1, \lambda_B]}{b-1}, \tag{3.5.2}$$
and
(iii)
$$\frac{MS_A}{\sigma_e^2} \sim \frac{\chi^{2\prime}[a-1, \lambda_A]}{a-1}, \tag{3.5.3}$$

where, as usual, $\chi^2[\cdot]$ denotes a central and $\chi^{2\prime}[\cdot\,, \cdot]$ denotes a noncentral
chi-square variable with respective degrees of freedom, and the noncentrality
parameters $\lambda_B$ and $\lambda_A$ are defined by

$$\lambda_B = \frac{a}{2\sigma_e^2}\sum_{j=1}^{b} \beta_j^2$$

and

$$\lambda_A = \frac{b}{2\sigma_e^2}\sum_{i=1}^{a} \alpha_i^2.$$

It, therefore, follows from (3.5.2) and (3.5.3) that

(ii)' If $\beta_j = 0$, for all $j$, then
$$\frac{MS_B}{\sigma_e^2} \sim \frac{\chi^2[b-1]}{b-1}. \tag{3.5.4}$$
(iii)' If $\alpha_i = 0$, for all $i$, then
$$\frac{MS_A}{\sigma_e^2} \sim \frac{\chi^2[a-1]}{a-1}. \tag{3.5.5}$$


MODEL II (RANDOM EFFECTS)
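Under the distribution assumptions of Model II, it can be shown that:

(a) The quantities $MS_E$, $MS_B$, and $MS_A$ are statistically independent.
(b) The following results are true:

(i)
$$\frac{MS_E}{\sigma_e^2} \sim \frac{\chi^2[(a-1)(b-1)]}{(a-1)(b-1)}, \tag{3.5.6}$$
(ii)
$$\frac{MS_B}{\sigma_e^2 + a\,\sigma_\beta^2} \sim \frac{\chi^2[b-1]}{b-1}, \tag{3.5.7}$$
and
(iii)
$$\frac{MS_A}{\sigma_e^2 + b\,\sigma_\alpha^2} \sim \frac{\chi^2[a-1]}{a-1}. \tag{3.5.8}$$

That is, the ratio of $MS_E$ to $\sigma_e^2$ is a $\chi^2[(a-1)(b-1)]$ variable divided by
$(a-1)(b-1)$, the ratio of $MS_B$ to $\sigma_e^2 + a\,\sigma_\beta^2$ is a $\chi^2[b-1]$ variable divided
by $b-1$, and the ratio of $MS_A$ to $\sigma_e^2 + b\,\sigma_\alpha^2$ is a $\chi^2[a-1]$ variable divided
by $a-1$.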


MODEL III (MIXED EFFECTS) 
Under the distribution assumptions of Model III, it can be shown that:

(a) The quantities $MS_E$, $MS_B$, and $MS_A$ are statistically independent.
(b) The following results are true:

(i)
$$\frac{MS_E}{\sigma_e^2} \sim \frac{\chi^2[(a-1)(b-1)]}{(a-1)(b-1)}, \tag{3.5.9}$$
(ii)
$$\frac{MS_B}{\sigma_e^2 + a\,\sigma_\beta^2} \sim \frac{\chi^2[b-1]}{b-1}, \tag{3.5.10}$$
and
(iii)
$$\frac{MS_A}{\sigma_e^2} \sim \frac{\chi^{2\prime}[a-1, \lambda_A]}{a-1}, \tag{3.5.11}$$

where

$$\lambda_A = \frac{b}{2\sigma_e^2}\sum_{i=1}^{a} \alpha_i^2.$$

It follows from (3.5.11) that if $\alpha_i = 0$, for all $i$, then

(iii)'
$$\frac{MS_A}{\sigma_e^2} \sim \frac{\chi^2[a-1]}{a-1}. \tag{3.5.12}$$


3.6 TESTS OF HYPOTHESES: THE ANALYSIS OF VARIANCE F TESTS


In this section, we present the usual hypotheses about the effects of A and B 
factors and the appropriate F tests for fixed, random, and mixed effects models. 


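MODEL I (FIXED EFFECTS)
Under Model I, the usual hypotheses of interest are:

$$
\begin{aligned}
&H_0^B : \text{all } \beta_j\text{'s} = 0 \\
&\text{versus} \\
&H_1^B : \text{not all } \beta_j\text{'s are zero};
\end{aligned}
\tag{3.6.1}
$$

and

$$
\begin{aligned}
&H_0^A : \text{all } \alpha_i\text{'s} = 0 \\
&\text{versus} \\
&H_1^A : \text{not all } \alpha_i\text{'s are zero}.
\end{aligned}
\tag{3.6.2}
$$
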


In order to develop test procedures for the hypotheses (3.6.1) and (3.6.2), we
note from (3.4.12), (3.4.16), and (3.4.17) that when the null hypotheses $H_0^B$
and $H_0^A$ are true, we have

$$E(MS_E) = \sigma_e^2,$$

$$E(MS_B) = \sigma_e^2,$$

and

$$E(MS_A) = \sigma_e^2;$$

that is, $MS_E$, $MS_B$, and $MS_A$ are unbiased estimates of the same quantity $\sigma_e^2$.
It then follows, from (3.5.1), (3.5.4), and (3.5.5), that

$$F_B = \frac{MS_B/\sigma_e^2}{MS_E/\sigma_e^2} = \frac{MS_B}{MS_E} \sim F[b-1, (a-1)(b-1)] \tag{3.6.3}$$

and

$$F_A = \frac{MS_A/\sigma_e^2}{MS_E/\sigma_e^2} = \frac{MS_A}{MS_E} \sim F[a-1, (a-1)(b-1)]. \tag{3.6.4}$$

Therefore, the statistics (3.6.3) and (3.6.4) provide suitable test procedures
for testing hypotheses (3.6.1) and (3.6.2), respectively. Thus, $H_0^B$ is rejected at
the $\alpha$-level of significance if

$$F_B > F[b-1, (a-1)(b-1); 1-\alpha]. \tag{3.6.5}$$

Similarly, $H_0^A$ is rejected at the $\alpha$-level of significance if

$$F_A > F[a-1, (a-1)(b-1); 1-\alpha]. \tag{3.6.6}$$

It should be noted, however, that when the null hypotheses $H_0^B$ and $H_0^A$ are not
true, it follows from (3.5.1) through (3.5.3) that

$$\frac{MS_B}{MS_E} \sim F'[b-1, (a-1)(b-1); \lambda_B] \tag{3.6.7}$$

and

$$\frac{MS_A}{MS_E} \sim F'[a-1, (a-1)(b-1); \lambda_A], \tag{3.6.8}$$

where $F'[\cdot\,, \cdot\,; \cdot]$ denotes a statistic having a noncentral $F$ distribution with
respective degrees of freedom and the noncentrality parameters $\lambda_B$ and $\lambda_A$ defined
by

$$\lambda_B = \frac{a}{2\sigma_e^2}\sum_{j=1}^{b} \beta_j^2$$

and

$$\lambda_A = \frac{b}{2\sigma_e^2}\sum_{i=1}^{a} \alpha_i^2.$$

The results (3.6.7) and (3.6.8) are employed in evaluating the power of these $F$
tests in Section 3.11.


MODEL II (RANDOM EFFECTS) 


Under Model II, testing significance of the effects of a factor is equivalent to
testing the hypothesis that the corresponding variance component is zero. Thus,
the usual analogues of hypotheses (3.6.1) and (3.6.2) are:

$$H_0^B : \sigma_\beta^2 = 0 \quad \text{versus} \quad H_1^B : \sigma_\beta^2 > 0 \tag{3.6.9}$$

and

$$H_0^A : \sigma_\alpha^2 = 0 \quad \text{versus} \quad H_1^A : \sigma_\alpha^2 > 0. \tag{3.6.10}$$

It can be readily seen that the statistics (3.6.3) and (3.6.4), obtained in the case
of Model I, also provide suitable test procedures for the hypotheses (3.6.9) and
(3.6.10), respectively. However, under the alternative hypotheses, the aforesaid
statistics are distributed as constant multiples of a (central) $F$ variable rather than
as a noncentral $F$ as in the case of Model I, a fact which greatly simplifies the
computation of power under Model II.


MODEL III (MIXED EFFECTS) 


Under Model III, the hypotheses of interest are: $\sigma_\beta^2 = 0$ and $\alpha_i$'s $= 0$. Again, it
can be seen that the test statistics (3.6.3) and (3.6.4) developed earlier are also
applicable for these hypotheses.


3.7 POINT ESTIMATION 


In this section, we present results on point estimation for parameters of interest 
under fixed, random, and mixed effects models. 


MODEL I (FIXED EFFECTS) 


In the case of the fixed effects model, the least squares estimators¹ of the
parameters $\mu$, $\alpha_i$'s, and $\beta_j$'s are obtained by minimizing the residual sum of squares

$$Q = \sum_{i=1}^{a}\sum_{j=1}^{b} (y_{ij} - \mu - \alpha_i - \beta_j)^2, \tag{3.7.1}$$

with respect to $\mu$, $\alpha_i$'s, and $\beta_j$'s; and subject to the restrictions:

$$\sum_{i=1}^{a} \alpha_i = \sum_{j=1}^{b} \beta_j = 0. \tag{3.7.2}$$

1 The least squares estimators in this case are the same as those obtained by the maximum likelihood
method under the assumption of normality.

The resultant estimators are obtained to be:

$$\hat{\mu} = \bar{y}_{..}, \tag{3.7.3}$$

$$\hat{\alpha}_i = \bar{y}_{i.} - \bar{y}_{..}, \quad i = 1, 2, \ldots, a, \tag{3.7.4}$$

and

$$\hat{\beta}_j = \bar{y}_{.j} - \bar{y}_{..}, \quad j = 1, 2, \ldots, b. \tag{3.7.5}$$

These are the so-called best linear unbiased estimators (BLUE). The variances
of the estimators (3.7.3) through (3.7.5) are:

$$\text{Var}(\hat{\mu}) = \sigma_e^2 / ab, \tag{3.7.6}$$

$$\text{Var}(\hat{\alpha}_i) = (a-1)\,\sigma_e^2 / ab, \tag{3.7.7}$$

and

$$\text{Var}(\hat{\beta}_j) = (b-1)\,\sigma_e^2 / ab. \tag{3.7.8}$$

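The other parameters of interest are: $\mu + \alpha_i$ (mean levels of the factor A),
$\mu + \beta_j$ (mean levels of the factor B), pairwise differences $\alpha_i - \alpha_{i'}$ and $\beta_j - \beta_{j'}$, and
the contrasts of the type $\sum_{i=1}^{a} \ell_i \alpha_i$ $\left(\sum_{i=1}^{a} \ell_i = 0\right)$ and $\sum_{j=1}^{b} \ell_j \beta_j$ $\left(\sum_{j=1}^{b} \ell_j = 0\right)$.
Their respective estimates together with the variances are given by

$$\widehat{\mu + \alpha_i} = \bar{y}_{i.}, \qquad \text{Var}(\widehat{\mu + \alpha_i}) = \sigma_e^2 / b; \tag{3.7.9}$$

$$\widehat{\mu + \beta_j} = \bar{y}_{.j}, \qquad \text{Var}(\widehat{\mu + \beta_j}) = \sigma_e^2 / a; \tag{3.7.10}$$

$$\widehat{\alpha_i - \alpha_{i'}} = \bar{y}_{i.} - \bar{y}_{i'.}, \qquad \text{Var}(\widehat{\alpha_i - \alpha_{i'}}) = 2\sigma_e^2 / b; \tag{3.7.11}$$

$$\widehat{\beta_j - \beta_{j'}} = \bar{y}_{.j} - \bar{y}_{.j'}, \qquad \text{Var}(\widehat{\beta_j - \beta_{j'}}) = 2\sigma_e^2 / a; \tag{3.7.12}$$

$$\widehat{\sum_{i=1}^{a} \ell_i \alpha_i} = \sum_{i=1}^{a} \ell_i \bar{y}_{i.}, \qquad \text{Var}\left(\widehat{\sum_{i=1}^{a} \ell_i \alpha_i}\right) = \left(\sum_{i=1}^{a} \ell_i^2\right) \sigma_e^2 / b; \tag{3.7.13}$$

and

$$\widehat{\sum_{j=1}^{b} \ell_j \beta_j} = \sum_{j=1}^{b} \ell_j \bar{y}_{.j}, \qquad \text{Var}\left(\widehat{\sum_{j=1}^{b} \ell_j \beta_j}\right) = \left(\sum_{j=1}^{b} \ell_j^2\right) \sigma_e^2 / a. \tag{3.7.14}$$

The variance $\sigma_e^2$ is, of course, estimated unbiasedly by

$$\hat{\sigma}_e^2 = MS_E. \tag{3.7.15}$$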


MODEL II (RANDOM EFFECTS) 


In the case of the random effects model, the variance components may be
estimated by the analysis of variance method, that is, by equating the observed
mean squares in the lines of the analysis of variance table to their respective
expected values and solving the equations for the variance components. The
resulting estimators are:

$$\hat{\sigma}_e^2 = MS_E, \tag{3.7.16}$$

$$\hat{\sigma}_\beta^2 = \frac{1}{a}(MS_B - MS_E), \tag{3.7.17}$$

and

$$\hat{\sigma}_\alpha^2 = \frac{1}{b}(MS_A - MS_E). \tag{3.7.18}$$

It can be shown that these estimators are also the maximum likelihood estimators
(corrected for bias) of the corresponding parameters. The parameter $\mu$ is, of
course, estimated by

$$\hat{\mu} = \bar{y}_{..}, \tag{3.7.19}$$

as in the case of the fixed effects model. The remarks concerning the negative
estimates and the optimal properties of the analysis of variance estimators made
in Section 2.9 also apply here.²
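
As a small illustration of (3.7.16) through (3.7.18), the sketch below solves the expected mean square equations for the variance components. It is not part of the original text: the function name and the mean square values are hypothetical, chosen only to show the arithmetic, and the second call shows how a negative estimate can arise when a factor mean square falls below the error mean square.

```python
def anova_variance_components(ms_a, ms_b, ms_e, a, b):
    """Analysis of variance estimators of the Model II variance components."""
    return {
        "sigma2_e": ms_e,                   # (3.7.16)
        "sigma2_beta": (ms_b - ms_e) / a,   # (3.7.17)
        "sigma2_alpha": (ms_a - ms_e) / b,  # (3.7.18)
    }


# Hypothetical mean squares for a layout with a = 4 rows and b = 5 columns.
est = anova_variance_components(ms_a=11.2, ms_b=12.7, ms_e=1.1, a=4, b=5)
print(est)

# A factor mean square smaller than MS_E yields a negative estimate.
neg = anova_variance_components(ms_a=0.8, ms_b=12.7, ms_e=1.1, a=4, b=5)
print(neg["sigma2_alpha"])
```

How a negative estimate should be handled is a separate question, discussed in the references cited in the footnote.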


MODEL III (MIXED EFFECTS) 


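In the case of a mixed effects model, with A fixed and B random, the usual
parameters of interest are $\mu$, $\alpha_i$'s, $\sigma_\beta^2$, and $\sigma_e^2$. The corresponding estimators are:

$$\hat{\mu} = \bar{y}_{..}, \tag{3.7.20}$$

$$\hat{\alpha}_i = \bar{y}_{i.} - \bar{y}_{..}, \quad i = 1, 2, \ldots, a, \tag{3.7.21}$$

$$\hat{\sigma}_\beta^2 = \frac{1}{a}(MS_B - MS_E), \tag{3.7.22}$$

and

$$\hat{\sigma}_e^2 = MS_E. \tag{3.7.23}$$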


2 For a discussion of the nonnegative maximum likelihood as well as other nonnegative estimation 
procedures and their properties, the reader is referred to Sahai (1974b) and Sahai and Khurshid 
(1992). 




3.8 INTERVAL ESTIMATION 


In this section, we present results on confidence intervals for parameters of 
interest under fixed, random, and mixed effects models. 


MODEL I (FIXED EFFECTS)


Using the distribution theory of the error mean square; that is,

$$\frac{MS_E}{\sigma_e^2} \sim \frac{\chi^2[(a-1)(b-1)]}{(a-1)(b-1)}, \tag{3.8.1}$$

a $100(1-\alpha)$ percent confidence interval for $\sigma_e^2$ is obtained as

$$\frac{(a-1)(b-1)\,MS_E}{\chi^2[(a-1)(b-1), 1-\alpha/2]} < \sigma_e^2 < \frac{(a-1)(b-1)\,MS_E}{\chi^2[(a-1)(b-1), \alpha/2]}. \tag{3.8.2}$$
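
The interval (3.8.2) is simple to evaluate once the two chi-square percentiles are read from a table. The following sketch is an illustration under stated assumptions, not part of the original text: the function name is hypothetical, the error mean square 195.639 with $a = 4$ and $b = 3$ is borrowed from the worked example of Section 3.13, and the percentiles 1.237 and 14.449 are the standard tabled 2.5th and 97.5th percentiles of chi-square with 6 degrees of freedom.

```python
def error_variance_interval(ms_e, a, b, chi2_lower, chi2_upper):
    """Confidence limits (3.8.2) for sigma_e^2.

    chi2_lower and chi2_upper are the alpha/2 and 1 - alpha/2 percentiles of
    chi-square with (a - 1)(b - 1) degrees of freedom, supplied by the user.
    """
    df = (a - 1) * (b - 1)
    return df * ms_e / chi2_upper, df * ms_e / chi2_lower


# 95 percent limits with 6 error degrees of freedom (tabled percentiles).
low, high = error_variance_interval(
    ms_e=195.639, a=4, b=3, chi2_lower=1.237, chi2_upper=14.449)
print(round(low, 2), round(high, 2))
```

With so few error degrees of freedom the interval is very wide, which is typical for variance estimates based on small layouts.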


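Also, it is possible to construct confidence intervals using the $t$ distribution for
a pairwise difference $\alpha_i - \alpha_{i'}$ or the contrast $\sum_{i=1}^{a} \ell_i \alpha_i$, where $\sum_{i=1}^{a} \ell_i = 0$.
To obtain the confidence limits for $\alpha_i - \alpha_{i'}$, we note from (3.7.11) that

$$\widehat{\alpha_i - \alpha_{i'}} = \bar{y}_{i.} - \bar{y}_{i'.}$$

with

$$\text{Var}(\bar{y}_{i.} - \bar{y}_{i'.}) = 2\sigma_e^2 / b.$$

The confidence limits on $\alpha_i - \alpha_{i'}$ can, therefore, be derived from the relation

$$\frac{(\bar{y}_{i.} - \bar{y}_{i'.}) - (\alpha_i - \alpha_{i'})}{\sqrt{2MS_E / b}} \sim t[(a-1)(b-1)].$$

Thus, the corresponding $100(1-\alpha)$ percent confidence limits for $\alpha_i - \alpha_{i'}$ are
given by

$$(\bar{y}_{i.} - \bar{y}_{i'.}) \pm t[(a-1)(b-1), 1-\alpha/2]\,\sqrt{2MS_E / b}.$$

Similarly, confidence limits for the contrast $\sum_{i=1}^{a} \ell_i \alpha_i$ can be obtained from
the relation

$$\frac{\sum_{i=1}^{a} \ell_i \bar{y}_{i.} - \sum_{i=1}^{a} \ell_i \alpha_i}{\sqrt{MS_E \sum_{i=1}^{a} \ell_i^2 / b}} \sim t[(a-1)(b-1)],$$

which yields

$$P\left\{\sum_{i=1}^{a} \ell_i \bar{y}_{i.} - M < \sum_{i=1}^{a} \ell_i \alpha_i < \sum_{i=1}^{a} \ell_i \bar{y}_{i.} + M\right\} = 1 - \alpha,$$

where

$$M = t[(a-1)(b-1), 1-\alpha/2]\,\sqrt{MS_E \sum_{i=1}^{a} \ell_i^2 / b}.$$

Similar results hold for any pairwise difference $\beta_j - \beta_{j'}$ or the contrast
$\sum_{j=1}^{b} \ell_j \beta_j$ $\left(\sum_{j=1}^{b} \ell_j = 0\right)$. However, for the reasons given in Section 2.19, the
multiple comparison procedures discussed in Section 3.12 should be preferred.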


MODEL II (RANDOM EFFECTS) 


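Exact confidence intervals for $\sigma_e^2$, $\sigma_e^2 + a\,\sigma_\beta^2$ or $\sigma_e^2 + b\,\sigma_\alpha^2$, variance ratios $\sigma_\beta^2/\sigma_e^2$
and $\sigma_\alpha^2/\sigma_e^2$, and proportions of variances $\sigma_e^2/(\sigma_e^2 + \sigma_\beta^2)$, $\sigma_\beta^2/(\sigma_e^2 + \sigma_\beta^2)$,
$\sigma_e^2/(\sigma_e^2 + \sigma_\alpha^2)$, and $\sigma_\alpha^2/(\sigma_e^2 + \sigma_\alpha^2)$ can be obtained by using the results on the sampling
distribution for mean squares. In particular, the probability is $1 - \alpha$ that the
interval

$$\left[\frac{1}{a}\left(\frac{MS_B}{MS_E} \cdot \frac{1}{F[b-1, (a-1)(b-1); 1-\alpha/2]} - 1\right),\;
\frac{1}{a}\left(\frac{MS_B}{MS_E} \cdot \frac{1}{F[b-1, (a-1)(b-1); \alpha/2]} - 1\right)\right]$$

captures $\sigma_\beta^2/\sigma_e^2$. However, as before, exact confidence intervals for $\sigma_\alpha^2$ and $\sigma_\beta^2$
do not exist. The approximate procedures available in the case of one-way
classification are also applicable here.³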


MODEL III (MIXED EFFECTS) 


The objective in this case is to set confidence limits on the variance components
$\sigma_e^2$, $\sigma_\beta^2$, and on fixed effects $\alpha_i$'s. The limits on $\sigma_e^2$ are the same as given in (3.8.2).
Again, an exact interval for $\sigma_\beta^2$ is not available, but one can set exact limits on
$\sigma_e^2 + a\,\sigma_\beta^2$ and $\sigma_\beta^2/\sigma_e^2$. Similarly, one can obtain confidence intervals for a pairwise
difference $\alpha_i - \alpha_{i'}$ or the contrast $\sum_{i=1}^{a} \ell_i \alpha_i$ $\left(\sum_{i=1}^{a} \ell_i = 0\right)$. Approximate
confidence intervals for $\sigma_\beta^2$ and a single factor A level mean $\mu + \alpha_i$ can be
determined using the Satterthwaite procedure (see Appendix K). For some additional
results including numerical examples, see Burdick and Graybill (1992,
pp. 154-156).


3 For some results on approximate confidence intervals for the variance components $\sigma_\alpha^2$ and $\sigma_\beta^2$,
and the total variance $\sigma_e^2 + \sigma_\beta^2 + \sigma_\alpha^2$, including a numerical example, see Burdick and Graybill
(1992, pp. 126-128). The problem of setting confidence intervals on the proportions of variability
$\sigma_e^2/(\sigma_e^2 + \sigma_\beta^2 + \sigma_\alpha^2)$, $\sigma_\beta^2/(\sigma_e^2 + \sigma_\beta^2 + \sigma_\alpha^2)$, and $\sigma_\alpha^2/(\sigma_e^2 + \sigma_\beta^2 + \sigma_\alpha^2)$ has been considered by Arteaga
et al. (1982). For a concise summary of the results, including a numerical example, see Burdick
and Graybill (1992, pp. 129-132).


3.9 COMPUTATIONAL FORMULAE AND PROCEDURE 


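As in the case of one-way classification, the sums of squares $SS_T$, $SS_A$, $SS_B$,
and $SS_E$ can be expressed in more computationally suitable forms. These are:

$$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij}^2 - \frac{y_{..}^2}{ab}, \tag{3.9.1}$$

$$SS_A = \sum_{i=1}^{a} \frac{y_{i.}^2}{b} - \frac{y_{..}^2}{ab}, \tag{3.9.2}$$

$$SS_B = \sum_{j=1}^{b} \frac{y_{.j}^2}{a} - \frac{y_{..}^2}{ab}, \tag{3.9.3}$$

and

$$SS_E = \sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij}^2 - \sum_{i=1}^{a} \frac{y_{i.}^2}{b} - \sum_{j=1}^{b} \frac{y_{.j}^2}{a} + \frac{y_{..}^2}{ab}, \tag{3.9.4}$$

where

$$y_{i.} = \sum_{j=1}^{b} y_{ij}, \qquad y_{.j} = \sum_{i=1}^{a} y_{ij},$$

and

$$y_{..} = \sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij}.$$

The error sum of squares $SS_E$ is usually calculated by subtracting $SS_A + SS_B$
from $SS_T$; that is,

$$SS_E = SS_T - SS_A - SS_B. \tag{3.9.5}$$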


The computational procedure for the sums of squares can be performed in
the following sequence of steps:

(i) Sum the observations for each row to form the row totals:

$$y_{1.}, y_{2.}, \ldots, y_{a.}.$$

(ii) Sum the observations for each column to form the column totals:

$$y_{.1}, y_{.2}, \ldots, y_{.b}.$$

(iii) Sum all the observations to obtain the overall or grand total:

$$y_{..} = \sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij}.$$

(iv) Form the sum of squares of the individual observations to yield:

$$\sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij}^2 = y_{11}^2 + y_{12}^2 + \cdots + y_{ab}^2.$$

(v) Form the sum of squares of the totals for each row and divide it by $b$ to
yield:

$$\sum_{i=1}^{a} y_{i.}^2 / b.$$

(vi) Form the sum of squares of the totals for each column and divide it by
$a$ to yield:

$$\sum_{j=1}^{b} y_{.j}^2 / a.$$

(vii) Square the grand total and divide it by $ab$ to obtain the correction factor:

$$\frac{y_{..}^2}{ab}.$$

Now, the required sums of squares, $SS_T$, $SS_A$, $SS_B$, and $SS_E$ are obtained by
using the computational formulae (3.9.1), (3.9.2), (3.9.3), and (3.9.4) or (3.9.5),
respectively.

It is expected that most investigators would use computers in the handling of 
analysis of variance calculations. Otherwise, an electronic calculator is highly 
recommended, especially for a large data set. Such calculators have the addi- 
tional advantage that the totals and sums of squares can be determined at the 
same time providing a check on the previous calculation. 
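
For readers working on a computer, steps (i) through (vii) translate almost line by line into code. The sketch below is one possible arrangement, not a prescribed implementation: the function name is hypothetical, and the data are the wear-testing weight losses of Table 3.3 (rows are materials, columns are positions), used here only as input.

```python
def two_way_sums_of_squares(y):
    """Sums of squares by the computational formulae (3.9.1) to (3.9.5)."""
    a, b = len(y), len(y[0])
    row_totals = [sum(row) for row in y]                              # step (i)
    col_totals = [sum(y[i][j] for i in range(a)) for j in range(b)]   # step (ii)
    grand_total = sum(row_totals)                                     # step (iii)
    raw_ss = sum(v * v for row in y for v in row)                     # step (iv)
    row_term = sum(t * t for t in row_totals) / b                     # step (v)
    col_term = sum(t * t for t in col_totals) / a                     # step (vi)
    correction = grand_total ** 2 / (a * b)                           # step (vii)

    ss_t = raw_ss - correction     # (3.9.1)
    ss_a = row_term - correction   # (3.9.2)
    ss_b = col_term - correction   # (3.9.3)
    ss_e = ss_t - ss_a - ss_b      # (3.9.5)
    return ss_t, ss_a, ss_b, ss_e


# Weight losses (mg) from Table 3.3: 4 materials (rows) by 3 positions (columns).
wear = [[241, 270, 274],
        [195, 241, 218],
        [235, 273, 230],
        [234, 236, 227]]

ss_t, ss_a, ss_b, ss_e = two_way_sums_of_squares(wear)
print(round(ss_t, 3), round(ss_a, 3), round(ss_b, 3), round(ss_e, 3))
```

The printed values can be compared with the hand calculation in the worked example of Section 3.13.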


3.10 MISSING OBSERVATIONS 


In the analysis of variance discussed in this chapter, it is assumed that there
is exactly one observation in each cell of the two-way layout as shown in
Table 3.1. However, in the process of conducting an experiment, some of the


observations may be lost. For example, the experimenter may fail to record an 
observation, animals or plants may die during the course of the experiment, or 
a subject may withdraw before the completion of the experiment. In such cases 
an approximate analysis discussed here may be used. The method consists of 
inserting estimates of the missing values and then carrying out the usual analysis 
as if no observations were missing. The estimates are obtained so as to minimize 
the residual or error sum of squares. However, care should be exercised in not 
including these estimates when computing the relevant degrees of freedom. 
Thus, for every missing value being estimated, the degrees of freedom for the 
residual mean square are reduced by one. 

Suppose the observation corresponding to the $i$-th row and the $j$-th column is
missing and let it be denoted by $y_{ij}$. Then all the sums of squares are computed in
the usual way except, of course, that they all involve $y_{ij}$. It is then an elementary
calculus problem to show that the value of $y_{ij}$ which minimizes the error sum
of squares is given by

$$\hat{y}_{ij} = \frac{a\,y'_{i.} + b\,y'_{.j} - y'_{..}}{(a-1)(b-1)}, \tag{3.10.1}$$

where $y'_{i.}$ denotes the total of $b - 1$ observations in the $i$-th row, $y'_{.j}$ denotes
the total of $a - 1$ observations in the $j$-th column, and $y'_{..}$ denotes the sum of
all $ab - 1$ observations. The mathematical derivation of the formula (3.10.1)
may be found in an intermediate or advanced level text (see, e.g., Peng (1967,
pp. 109-110); Montgomery (1991, pp. 148-151); Hinkelmann and Kempthorne
(1994, pp. 266-267)). If $\hat{y}_{ij}$ obtained from (3.10.1) is substituted for the missing
value, then $SS_A$, $SS_B$, and $SS_E$ can be computed in the usual way.
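
The estimate of a single missing value, computed as $(a\,y'_{i.} + b\,y'_{.j} - y'_{..})/((a-1)(b-1))$ from the totals of the remaining observations, can be sketched as follows. This is an illustration only: the function names are hypothetical, and the example pretends that one cell of the Table 3.3 data (material 2, position 2) was lost, which is an assumption made purely to have numbers to work with. The helper `error_ss` is included to confirm that the estimate does minimize the error sum of squares.

```python
def estimate_missing_value(y, i, j):
    """Estimate the single missing cell y[i][j] (stored as None) from the observed totals."""
    a, b = len(y), len(y[0])
    row_total = sum(v for v in y[i] if v is not None)                  # b - 1 observations
    col_total = sum(y[r][j] for r in range(a) if y[r][j] is not None)  # a - 1 observations
    grand_total = sum(v for row in y for v in row if v is not None)    # ab - 1 observations
    return (a * row_total + b * col_total - grand_total) / ((a - 1) * (b - 1))


def error_ss(y):
    """Error sum of squares of a complete a x b table."""
    a, b = len(y), len(y[0])
    grand = sum(map(sum, y)) / (a * b)
    rows = [sum(row) / b for row in y]
    cols = [sum(y[i][j] for i in range(a)) / a for j in range(b)]
    return sum((y[i][j] - rows[i] - cols[j] + grand) ** 2
               for i in range(a) for j in range(b))


# Table 3.3 data with one cell treated as missing (a hypothetical loss).
table = [[241, 270, 274],
         [195, None, 218],
         [235, 273, 230],
         [234, 236, 227]]

y_hat = estimate_missing_value(table, 1, 1)
print(y_hat)


def filled_with(value):
    """Return a copy of the table with the missing cell replaced by `value`."""
    return [[value if v is None else v for v in row] for row in table]


# The error sum of squares is smallest at the estimate.
print(error_ss(filled_with(y_hat)) < error_ss(filled_with(y_hat + 1.0)))
print(error_ss(filled_with(y_hat)) < error_ss(filled_with(y_hat - 1.0)))
```

When the filled table is analyzed, the error degrees of freedom must still be reduced by one, as noted above.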

The formula (3.10.1) was first discussed by Allen and Wishart (1930). The
$F$ test will be slightly biased and the reader is referred to the paper by Yates
(1933) for a discussion on this point. Also, it can be shown that when there is
a missing observation in the $i$-th row,

$$\text{Var}(\bar{y}_{i.}) = \left[\frac{1}{b} + \frac{a}{b(a-1)(b-1)}\right]\sigma_e^2 \tag{3.10.2}$$

and

$$\text{Var}(\bar{y}_{i.} - \bar{y}_{i'.}) = \left[\frac{2}{b} + \frac{a}{b(a-1)(b-1)}\right]\sigma_e^2. \tag{3.10.3}$$

The expression (3.10.2) can also be written as

$$\text{Var}(\bar{y}_{i.}) = \left[\frac{1}{b-1} + \frac{1}{b(a-1)(b-1)}\right]\sigma_e^2. \tag{3.10.4}$$

Note that the expression (3.10.4) is slightly greater than $\sigma_e^2/(b-1)$ and the
variance of a row mean with no missing value is $\sigma_e^2/b$. Furthermore, the error
mean square is estimated correctly but the mean squares for factors A and B
are somewhat inflated. To correct for this bias the quantity

$$\frac{[y'_{.j} - (a-1)\,\hat{y}_{ij}]^2}{a(a-1)} \tag{3.10.5}$$

is subtracted from the factor A mean square. Similarly, to test for the factor B
effects, the quantity

$$\frac{[y'_{i.} - (b-1)\,\hat{y}_{ij}]^2}{b(b-1)} \tag{3.10.6}$$

is subtracted from the factor B mean square.

If there are two missing values, one can either repeat the foregoing proce- 
dure with two simultaneous equations obtained by minimizing the error sum 
of squares with respect to two missing values, or one can obtain an iterative 
procedure of guessing one value, and fitting the other by formula (3.10.1), then 
going back and fitting the first value, and so on. When there are several miss- 
ing values, one first guesses values for all units except the first one. Formula 
(3.10.1) is then used to find an initial estimate of the first missing value. With 
this initial estimate for the first one and the values guessed for the others, the 
formula (3.10.1) is again used to obtain an estimate of the second value. The 
process is continued in this manner to obtain estimates for the remaining val- 
ues. After completing the first cycle of the initial estimates, a second set of 
estimated values is found and the entire process is repeated several times until
the estimated values are not different from those obtained in the previous cycle. 
The details may be found in Tocher (1952), Bennett and Franklin (1954, p. 
382), Cochran and Cox (1957, pp. 110-112), and Steel and Torrie (1980, pp. 
211-213). Healy and Westmacott (1956) gave a more general iterative method 
using a program that analyzes complete data rather rapidly, and Rubin (1972) 
presented a noniterative method. For m missing values, a computer program 
is usually required that will invert an m x m matrix. The general problem of 
missing data can usually be dealt with much more efficiently using an algo- 
rithm developed by Dempster et al. (1977). For further discussion of this topic 
the reader is referred to Anderson (1946), Dodge (1985), and Snedecor and 
Cochran (1989, pp. 273-278).⁴ For a discussion of correction for bias in mean
squares for factors A and B when two or more observations in a row or column
are missing, see Glen and Kramer (1958).
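
The iterative procedure for several missing values can be sketched as below. This is a minimal illustration, not the algorithm of any of the papers cited: the function name, the starting guess (the mean of the observed values), the tolerance, and the cycle limit are all assumptions, and the example again uses the Table 3.3 data with two cells arbitrarily treated as missing. Each pass re-estimates every missing cell with the single-value formula (3.10.1), holding the current values of the other cells fixed.

```python
def fill_missing_iteratively(y, tol=1e-8, max_cycles=200):
    """Fill cells marked None by cycling the single-missing-value formula."""
    a, b = len(y), len(y[0])
    missing = [(i, j) for i in range(a) for j in range(b) if y[i][j] is None]
    observed = [v for row in y for v in row if v is not None]
    start = sum(observed) / len(observed)  # initial guess for every missing cell
    filled = [[start if v is None else float(v) for v in row] for row in y]

    for _ in range(max_cycles):
        largest_change = 0.0
        for i, j in missing:
            # Totals that exclude the cell currently being re-estimated.
            row_total = sum(filled[i]) - filled[i][j]
            col_total = sum(filled[r][j] for r in range(a)) - filled[i][j]
            grand_total = sum(map(sum, filled)) - filled[i][j]
            new = (a * row_total + b * col_total - grand_total) / ((a - 1) * (b - 1))
            largest_change = max(largest_change, abs(new - filled[i][j]))
            filled[i][j] = new
        if largest_change < tol:  # estimates no longer differ between cycles
            break
    return filled


# Table 3.3 data with two cells treated as missing (a hypothetical loss).
table = [[241, 270, 274],
         [195, None, 218],
         [235, 273, 230],
         [None, 236, 227]]

filled = fill_missing_iteratively(table)
print(round(filled[1][1], 3), round(filled[3][0], 3))
```

With $m$ values estimated in this way, the error degrees of freedom are reduced by $m$ before any test is carried out.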

It should be remarked that the use of estimates for missing values does not in 
any way recover the information that is lost through the missing data. It is merely 
a computational procedure to enable the experimenter to make an approximate 


4 Hoyle (1971) gives an introduction to spoilt data (missing, extra, and mixed up observations) with 
an extensive bibliography. Afifi and Elashoff (1966, 1967) in a two-part article have considered 
the problem of missing data in multivariate statistics. 




analysis. It is important that the investigator examine carefully the nature of the 
missing values. If the reasons for missing values can be attributed to chance, 
the treatment comparisons based on the remaining values will be unbiased, and 
the methods described previously may be generally applied. It is worthwhile 
to remember that the analysis of variance does not take lightly to missing data 
and utmost caution should be exercised to ensure that no observation is lost. 
The situation is best summed up by Cochran and Cox (1957, p. 82) when they 
state: “... the only complete solution to the ‘missing data’ problem is not to 
have them....” 


3.11 POWER OF THE ANALYSIS OF VARIANCE F TESTS 


The discussion on the power of the analysis of variance $F$ test for the one-way
classification given in Section 2.17 also applies here. Thus, under Model I, it
follows from (3.6.5) and (3.6.7) that the power of the $F$ test for the hypothesis
on $\beta_j$'s is given by

$$
\begin{aligned}
\text{Power} &= P\left\{\frac{MS_B}{MS_E} > F[b-1, (a-1)(b-1); 1-\alpha] \,\Big|\, \beta_j \neq 0 \text{ for at least one } j\right\} \\
&= P\left\{F'[b-1, (a-1)(b-1); \phi_B] > F[b-1, (a-1)(b-1); 1-\alpha]\right\},
\end{aligned}
\tag{3.11.1}
$$

where

$$\phi_B = \frac{1}{\sigma_e}\sqrt{\frac{a}{b}\sum_{j=1}^{b} \beta_j^2}.$$

Similarly, for the hypothesis on $\alpha_i$'s, we obtain

$$\text{Power} = P\left\{F'[a-1, (a-1)(b-1); \phi_A] > F[a-1, (a-1)(b-1); 1-\alpha]\right\}, \tag{3.11.2}$$

where

$$\phi_A = \frac{1}{\sigma_e}\sqrt{\frac{b}{a}\sum_{i=1}^{a} \alpha_i^2}.$$

The expressions (3.11.1) and (3.11.2) can be evaluated by using noncentral $F$
tables or Pearson-Hartley charts as described in Section 2.17.


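Under Model II, the power of the test for the hypothesis

$$H_0 : \sigma_\beta^2 = 0 \quad \text{versus} \quad H_1 : \sigma_\beta^2 > 0$$

is given by

$$
\begin{aligned}
&P\left\{\frac{MS_B}{MS_E} > F[b-1, (a-1)(b-1); 1-\alpha] \,\Big|\, \sigma_\beta^2 > 0\right\} \\
&\quad = P\left\{F[b-1, (a-1)(b-1)] > \left(1 + \frac{a\,\sigma_\beta^2}{\sigma_e^2}\right)^{-1} F[b-1, (a-1)(b-1); 1-\alpha]\right\}.
\end{aligned}
\tag{3.11.3}
$$

Likewise, the power of the test for the hypothesis

$$H_0 : \sigma_\alpha^2 = 0 \quad \text{versus} \quad H_1 : \sigma_\alpha^2 > 0$$

is given by

$$P\left\{F[a-1, (a-1)(b-1)] > \left(1 + \frac{b\,\sigma_\alpha^2}{\sigma_e^2}\right)^{-1} F[a-1, (a-1)(b-1); 1-\alpha]\right\}. \tag{3.11.4}$$

Powers of the tests corresponding to the more general hypotheses of the type
$\sigma_\beta^2/\sigma_e^2 \le \rho$ can also be obtained similarly.

Under Model III, the power of the test for $\beta_j$ effects involves the central $F$
distribution and for $\alpha_i$ effects involves the noncentral $F$ distribution. The power
results for $\sigma_\beta^2$ are then the same as given in (3.11.3) and for $\alpha_i$'s the results are
given by (3.11.2).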


3.12 MULTIPLE COMPARISON METHODS 


The results on multiple comparisons discussed in Section 2.19 are also applicable
here with a few minor modifications. The procedures can be utilized for
the fixed as well as the mixed effects models. For the fixed effects case, as we
have seen in Section 3.8, the contrasts of interest may involve $\alpha_i$'s or $\beta_j$'s and
will be of the form

$$L = \ell_1 \alpha_1 + \ell_2 \alpha_2 + \cdots + \ell_a \alpha_a$$

or

$$L' = \ell'_1 \beta_1 + \ell'_2 \beta_2 + \cdots + \ell'_b \beta_b,$$

where

$$\sum_{i=1}^{a} \ell_i = \sum_{j=1}^{b} \ell'_j = 0.$$

The estimates of $L$ and $L'$ are given by

$$\hat{L} = \ell_1 \bar{y}_{1.} + \ell_2 \bar{y}_{2.} + \cdots + \ell_a \bar{y}_{a.}$$

and

$$\hat{L}' = \ell'_1 \bar{y}_{.1} + \ell'_2 \bar{y}_{.2} + \cdots + \ell'_b \bar{y}_{.b}.$$

The results on Tukey's and Scheffé's methods are the same as given in Section
2.19 except that now $MS_E$ replaces $MS_W$ and $(a-1)(b-1)$ replaces
$a(n-1)$ in degrees of freedom entries. Furthermore, the between-groups mean square
$MS_B$ of Section 2.19 will be replaced by $MS_A$ or $MS_B$ depending upon whether the
inferences are sought on $\alpha_i$'s or $\beta_j$'s.
Thus, for example, if Tukey's method is used, $L$ is significant at the $\alpha$-level if

$$\frac{|\hat{L}|}{\sqrt{b^{-1} MS_E}\,\left(\frac{1}{2}\sum_{i=1}^{a} |\ell_i|\right)} > q[a, (a-1)(b-1); 1-\alpha].$$

Similarly, if Scheffé's method is used, $L$ is significant at the $\alpha$-level if

$$\frac{|\hat{L}|}{\left[(a-1)\,MS_E \sum_{i=1}^{a} \ell_i^2 / b\right]^{1/2}} > \left\{F[a-1, (a-1)(b-1); 1-\alpha]\right\}^{1/2}.$$

Likewise, the significance of the contrast $L'$ can be tested. The other multiple
comparison procedures can also be similarly modified.

For a single pairwise comparison, one can use $\sqrt{2}\,t[(a-1)(b-1), 1-\alpha/2]$
instead of $T = q[a, (a-1)(b-1); 1-\alpha]$. For a limited number of pairwise
comparisons, the Bonferroni method can be employed by using $\sqrt{2}\,t[(a-1)(b-1),
1-\alpha/2k]$ instead of $T$.

Under Model III, the contrasts of interest involve only the $\alpha_i$'s and the results
are identical to those given previously.⁵


5 For a general discussion of multiple comparison methods in a two-way layout, see Hirotsu (1973).

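
The Tukey and Scheffé criteria for a contrast among the factor A means can be evaluated as in the sketch below. It is an illustration under stated assumptions rather than part of the original text: the function names are hypothetical, the material means, $MS_E = 195.639$, and the tabled values $q[4, 6; 0.95] = 4.90$ and $F[3, 6; 0.95] = 4.76$ are taken from the worked example of Section 3.13, and the contrast compares materials 1 and 2.

```python
import math


def tukey_statistic(ell, row_means, ms_e, b):
    """|L_hat| divided by sqrt(MS_E / b) times half the sum of |ell_i|."""
    l_hat = sum(c * m for c, m in zip(ell, row_means))
    return abs(l_hat) / (math.sqrt(ms_e / b) * 0.5 * sum(abs(c) for c in ell))


def scheffe_statistic(ell, row_means, ms_e, a, b):
    """|L_hat| divided by the square root of (a - 1) MS_E sum(ell_i^2) / b."""
    l_hat = sum(c * m for c, m in zip(ell, row_means))
    return abs(l_hat) / math.sqrt((a - 1) * ms_e * sum(c * c for c in ell) / b)


# Material means and error mean square from the worked example (Section 3.13).
means = [261.67, 218.00, 246.00, 232.33]
ms_e, a, b = 195.639, 4, 3
contrast = [1, -1, 0, 0]  # material 1 versus material 2

tukey = tukey_statistic(contrast, means, ms_e, b)
scheffe = scheffe_statistic(contrast, means, ms_e, a, b)

print(round(tukey, 2), "to be compared with q[4, 6; 0.95] = 4.90")
print(round(scheffe, 2), "to be compared with", round(math.sqrt(4.76), 2))
```

For a pairwise contrast the Tukey criterion reduces to comparing the difference of two means with $q\sqrt{MS_E/b}$.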


TABLE 3.3
Loss in Weights Due to Wear Testing of Four Materials (in mg)

| Material | Position 1 | Position 2 | Position 3 |
|---|---|---|---|
| 1 | 241 | 270 | 274 |
| 2 | 195 | 241 | 218 |
| 3 | 235 | 273 | 230 |
| 4 | 234 | 236 | 227 |


3.13 WORKED EXAMPLE FOR MODEL I


Consider the following example described in Davies (1954). An experiment was 
carried out for wear testing of four materials. A test piece of each material was 
extracted from each of the three positions of a testing machine. The reduction 
in weight due to wear was determined on each piece of material in milligrams 
and the data are given in Table 3.3. 

It is desired to test whether there are significant differences due to different
materials and machine positions. Clearly, the data of Table 3.3 should be analyzed
under Model I, since the four materials and the three positions of testing
machines are especially chosen by the experimenter to be of particular interest
to her and thus will both have systematic effects. The mathematical model
would be

$$y_{ij} = \mu + \alpha_i + \beta_j + e_{ij} \qquad (i = 1, 2, 3, 4;\; j = 1, 2, 3),$$

where $\mu$ is the grand mean, $\alpha_i$ is the effect of the $i$-th material, $\beta_j$ is the
effect of the $j$-th position, and the $e_{ij}$'s are random errors, with $\sum_{i=1}^{4} \alpha_i = 0$,
$\sum_{j=1}^{3} \beta_j = 0$, and $e_{ij} \sim N(0, \sigma_e^2)$. It is further assumed that no interaction
between the material and position is likely to exist.

To perform the analysis of variance computations, we first obtain the row
and column totals as

$$y_{1.} = 785, \quad y_{2.} = 654, \quad y_{3.} = 738, \quad y_{4.} = 697;$$
$$y_{.1} = 905, \quad y_{.2} = 1{,}020, \quad y_{.3} = 949;$$

and the grand total is

$$y_{..} = 2{,}874.$$


The other quantities required in the calculations of the sums of squares are:

$$\frac{y_{..}^2}{ab} = \frac{(2{,}874)^2}{4 \times 3} = 688{,}323,$$

$$\frac{1}{b}\sum_{i=1}^{a} y_{i.}^2 = \frac{1}{3}\left[(785)^2 + (654)^2 + (738)^2 + (697)^2\right] = 691{,}464.667,$$

$$\frac{1}{a}\sum_{j=1}^{b} y_{.j}^2 = \frac{1}{4}\left[(905)^2 + (1{,}020)^2 + (949)^2\right] = 690{,}006.500,$$

and

$$\sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij}^2 = (241)^2 + (270)^2 + \cdots + (227)^2 = 694{,}322.$$

The resultant sums of squares are, therefore, given by

$$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij}^2 - \frac{y_{..}^2}{ab} = 694{,}322 - 688{,}323 = 5{,}999.000,$$

$$SS_A = \frac{1}{b}\sum_{i=1}^{a} y_{i.}^2 - \frac{y_{..}^2}{ab} = 691{,}464.667 - 688{,}323 = 3{,}141.667,$$

$$SS_B = \frac{1}{a}\sum_{j=1}^{b} y_{.j}^2 - \frac{y_{..}^2}{ab} = 690{,}006.500 - 688{,}323 = 1{,}683.500,$$

and

$$SS_E = SS_T - SS_A - SS_B = 5{,}999.000 - 3{,}141.667 - 1{,}683.500 = 1{,}173.833.$$

Finally, the results of the analysis of variance calculations are summarized in
Table 3.4.

If we choose the level of significance $\alpha = 0.05$, we find from Appendix
Table V that F[3, 6; 0.95] = 4.76 and F[2, 6; 0.95] = 5.14. Comparing these
values with the computed F values given in Table 3.4, we do not reject the
hypothesis of no "position" effects (p = 0.069), but reject the hypothesis of no
"material" effects (p = 0.039). That is, we may conclude that there is probably
a significant difference due to the materials but not due to positions.
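
The hand computations above can be checked with a short script. The following is only a minimal sketch in Python, using the data of Table 3.3; the function name `two_way_anova` and the dictionary keys are our own illustrative choices and do not come from any statistical package. The critical values and p-values still have to be read from tables or obtained from suitable software.

```python
# Two-way ANOVA without interaction, one observation per cell (Table 3.3).
# Rows are the a = 4 materials, columns are the b = 3 positions.
data = [
    [241, 270, 274],
    [195, 241, 218],
    [235, 273, 230],
    [234, 236, 227],
]


def two_way_anova(y):
    """Return sums of squares, mean squares and F ratios for an a x b table."""
    a, b = len(y), len(y[0])
    row_totals = [sum(row) for row in y]                       # y_i.
    col_totals = [sum(row[j] for row in y) for j in range(b)]  # y_.j
    grand_total = sum(row_totals)                              # y_..
    correction = grand_total ** 2 / (a * b)                    # y_..^2 / (ab)

    ss_total = sum(v ** 2 for row in y for v in row) - correction
    ss_rows = sum(t ** 2 for t in row_totals) / b - correction
    ss_cols = sum(t ** 2 for t in col_totals) / a - correction
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (a - 1)
    ms_cols = ss_cols / (b - 1)
    ms_error = ss_error / ((a - 1) * (b - 1))
    return {
        "ss_total": ss_total, "ss_rows": ss_rows,
        "ss_cols": ss_cols, "ss_error": ss_error,
        "ms_rows": ms_rows, "ms_cols": ms_cols, "ms_error": ms_error,
        "f_rows": ms_rows / ms_error, "f_cols": ms_cols / ms_error,
    }


result = two_way_anova(data)
for name, value in result.items():
    print(f"{name:9s} {value:12.3f}")
```

The printed sums of squares, mean squares, and F ratios should agree with the entries of Table 3.4 up to rounding.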

To determine which materials differ, we use Tukey’s and Scheffé’s procedures 
for pairwise comparisons. For the Tukey’s procedure, we find from Appendix 
Table X that 


$$q[a, (a-1)(b-1); 1-\alpha] = q[4, 6; 0.95] = 4.90.$$




TABLE 3.4
Analysis of Variance for the Weight Loss Data of Table 3.3

Source of   Degrees of   Sum of       Mean         Expected
Variation   Freedom      Squares      Square       Mean Square                                           F Value   p-Value
Material        3        3,141.667    1,047.222    $\sigma_e^2 + \frac{3}{3}\sum_{i=1}^{4}\alpha_i^2$    5.35      0.039
Position        2        1,683.500      841.750    $\sigma_e^2 + \frac{4}{2}\sum_{j=1}^{3}\beta_j^2$     4.30      0.069
Error           6        1,173.833      195.639    $\sigma_e^2$
Total          11        5,999.000


Now, the pairwise differences of sample means for materials would be compared to

$$q[a, (a-1)(b-1); 1-\alpha]\sqrt{\frac{MS_E}{b}} = 4.90\sqrt{\frac{195.639}{3}} = 39.57.$$

The four sample means for the materials are

$$\bar{y}_{1.} = 261.67, \quad \bar{y}_{2.} = 218.00, \quad \bar{y}_{3.} = 246.00, \quad \bar{y}_{4.} = 232.33,$$

and there are six pairs of differences to be compared. Furthermore,

$$|\bar{y}_{1.} - \bar{y}_{2.}| = 43.67 > 39.57, \qquad |\bar{y}_{1.} - \bar{y}_{3.}| = 15.67 < 39.57,$$
$$|\bar{y}_{1.} - \bar{y}_{4.}| = 29.34 < 39.57, \qquad |\bar{y}_{2.} - \bar{y}_{3.}| = 28.00 < 39.57,$$
$$|\bar{y}_{2.} - \bar{y}_{4.}| = 14.33 < 39.57, \qquad |\bar{y}_{3.} - \bar{y}_{4.}| = 13.67 < 39.57.$$


Hence, we may conclude that materials one and two are probably significantly 
different but not the others. 
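
The six Tukey comparisons can be enumerated mechanically. The sketch below is in Python and takes the tabulated value q[4, 6; 0.95] = 4.90 and the mean square error from the text as given; the variable names are illustrative assumptions only.

```python
import math
from itertools import combinations

# Sample means of the four materials and quantities quoted in the text.
means = {1: 261.67, 2: 218.00, 3: 246.00, 4: 232.33}
q_crit = 4.90        # q[4, 6; 0.95], read from Appendix Table X
ms_error = 195.639   # error mean square from Table 3.4
b = 3                # number of positions (observations per material mean)

# Tukey's critical difference: q * sqrt(MS_E / b).
critical_difference = q_crit * math.sqrt(ms_error / b)

significant_pairs = []
for i, j in combinations(sorted(means), 2):
    difference = abs(means[i] - means[j])
    flag = difference > critical_difference
    if flag:
        significant_pairs.append((i, j))
    print(f"materials {i} vs {j}: |diff| = {difference:6.2f}  significant: {flag}")

print(f"critical difference = {critical_difference:.2f}")
```

Only the pair of materials one and two should be flagged, in agreement with the conclusion drawn above.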
For Scheffé's procedure, we find from Appendix Table V that

$$S^2 = F[a-1, (a-1)(b-1); 1-\alpha] = F[3, 6; 0.95] = 4.76.$$

Furthermore, for the contrasts consisting of the differences between two means,
we have

$$\frac{1}{b}\sum_{i=1}^{a} \ell_i^2 = \frac{2}{3}.$$



TABLE 3.5
Number of Minutes Observed in Grazing

                                  Animal
Observer     1    2    3    4    5    6    7    8    9   10
   1        34   76   75   31   61   82   82   67   72   38
   2        33   76   72   29   60   82   84   67   72   36
   3        35   78   76   30   65   86   88   66   76   37
   4        34   77   71   29   60   78   83   67   72   37
   5        33   77   70   27   59   81   82   67   70   33

Source: John (1971, p. 68). Used with permission.


So that the differences among the sample means will now be compared to

$$\sqrt{S^2 (a-1) MS_E \left(\frac{1}{b}\sum_{i=1}^{a} \ell_i^2\right)} = \sqrt{4.76\,(3)\,(195.639)\left(\frac{2}{3}\right)} = 43.16.$$


Here, again, we may conclude that materials one and two are probably sig- 
nificantly different but not the others. However, evidence from the Scheffé’s 
method is not as strong as that from the Tukey’s method. The significance of 
any other contrasts of interest can also similarly be evaluated. 
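
As a check on the Scheffé critical value, and to show how other contrasts could be handled in the same way, we give a small Python sketch. The helper name `scheffe_critical_value` and the second contrast (material 1 against the average of the other three) are our own illustrative assumptions; the F value 4.76 is the tabulated value quoted above.

```python
import math

f_crit = 4.76        # F[3, 6; 0.95], read from Appendix Table V
ms_error = 195.639   # error mean square from Table 3.4
a, b = 4, 3          # numbers of materials and positions
means = [261.67, 218.00, 246.00, 232.33]


def scheffe_critical_value(coefficients):
    """Critical value sqrt(S^2 (a-1) MS_E sum(l_i^2)/b) for a contrast."""
    sum_sq = sum(c * c for c in coefficients)
    return math.sqrt(f_crit * (a - 1) * ms_error * sum_sq / b)


def contrast_estimate(coefficients):
    """Estimated contrast sum(l_i * ybar_i.)."""
    return sum(c * m for c, m in zip(coefficients, means))


pairwise = [1, -1, 0, 0]                  # material 1 minus material 2
one_vs_rest = [1, -1 / 3, -1 / 3, -1 / 3]   # hypothetical extra contrast

for label, coeffs in [("1 vs 2", pairwise), ("1 vs rest", one_vs_rest)]:
    estimate = contrast_estimate(coeffs)
    critical = scheffe_critical_value(coeffs)
    print(f"{label}: estimate = {estimate:.2f}, critical value = {critical:.2f}")
```

For the pairwise contrast the critical value reproduces the figure 43.16 used in the text.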


3.14 WORKED EXAMPLE FOR MODEL II 


The following example is based on a study on the grazing habits of Zebu cattle 
in Entebbe, Uganda. A group of 10 cattle was observed and recorded every 
minute they were grazing. A group of five observers was chosen and the same 
group was used during the entire experiment. They followed a group of cattle 
in the same paddock for 88 minutes during one afternoon. The data given in 
Table 3.5 are taken from John (1971, p. 68) and represent the number of minutes 
in which observer i (i = 1, ..., 5) reported animal j (j = 1, ..., 10) grazing.

We now proceed to analyze the data of Table 3.5 under Model II since the 
group of observers and animals in the study can be regarded as random samples 
from the respective populations of observers and cattle and the results of the 
analysis are to be valid for the entire populations. Thus, the factors of observers 
and animals will both have random effects. The mathematical model would be 


$$y_{ij} = \mu + \alpha_i + \beta_j + e_{ij} \qquad (i = 1, \ldots, 5;\ j = 1, \ldots, 10),$$

where $\mu$ is the general mean, $\alpha_i$ is the effect of the i-th observer, $\beta_j$ is the effect
of the j-th cattle, and the $e_{ij}$'s are random errors with

$$\alpha_i \sim N(0, \sigma_\alpha^2), \quad \beta_j \sim N(0, \sigma_\beta^2), \quad \text{and} \quad e_{ij} \sim N(0, \sigma_e^2).$$

In addition, the $\alpha_i$'s, $\beta_j$'s, and $e_{ij}$'s are mutually and completely independent.


To perform the analysis of variance computations, we first obtain the row 
and column totals as 


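$$y_{1.} = 618, \quad y_{2.} = 611, \quad y_{3.} = 637, \quad y_{4.} = 608, \quad y_{5.} = 599;$$
$$y_{.1} = 169, \quad y_{.2} = 384, \quad y_{.3} = 364, \quad y_{.4} = 146, \quad y_{.5} = 305,$$
$$y_{.6} = 409, \quad y_{.7} = 419, \quad y_{.8} = 334, \quad y_{.9} = 362, \quad y_{.10} = 181;$$

and the grand total is

$$y_{..} = 3{,}073.$$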
The other quantities needed in the calculations of the sums of squares are: 


$$\frac{y_{..}^2}{ab} = \frac{(3{,}073)^2}{50} = 188{,}866.580,$$

$$\frac{1}{b}\sum_{i=1}^{a} y_{i.}^2 = \frac{1}{10}\left[(618)^2 + (611)^2 + \cdots + (599)^2\right] = 188{,}947.900,$$

$$\frac{1}{a}\sum_{j=1}^{b} y_{.j}^2 = \frac{1}{5}\left[(169)^2 + (384)^2 + \cdots + (181)^2\right] = 208{,}211.400,$$

and

$$\sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij}^2 = (34)^2 + (76)^2 + \cdots + (33)^2 = 208{,}367.$$
i=l j=l 


The resultant sums of squares are, therefore, given by 


$$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij}^2 - \frac{y_{..}^2}{ab} = 208{,}367 - 188{,}866.580 = 19{,}500.420,$$

$$SS_A = \frac{1}{b}\sum_{i=1}^{a} y_{i.}^2 - \frac{y_{..}^2}{ab} = 188{,}947.900 - 188{,}866.580 = 81.320,$$

$$SS_B = \frac{1}{a}\sum_{j=1}^{b} y_{.j}^2 - \frac{y_{..}^2}{ab} = 208{,}211.400 - 188{,}866.580 = 19{,}344.820,$$

and

$$SS_E = SS_T - SS_A - SS_B = 19{,}500.420 - 81.320 - 19{,}344.820 = 74.280.$$

Finally, the results of the analysis of variance calculations are summarized in
Table 3.6.

TABLE 3.6
Analysis of Variance for the Grazing Data of Table 3.5

Source of   Degrees of   Sum of        Mean         Expected
Variation   Freedom      Squares       Square       Mean Square                       F Value     p-Value
Observer        4            81.320       20.330    $\sigma_e^2 + 10\sigma_\alpha^2$      9.85    <0.001
Animal          9        19,344.820    2,149.424    $\sigma_e^2 + 5\sigma_\beta^2$    1,041.89    <0.001
Error          36            74.280        2.063    $\sigma_e^2$
Total          49        19,500.420

If we choose the level of significance $\alpha = 0.05$, we find from Appendix
Table V that F[4, 36; 0.95] = 2.63 and F[9, 36; 0.95] = 2.15. Comparing
these values with the computed F values given in Table 3.6, we may conclude
that $\sigma_\alpha^2 > 0$ and $\sigma_\beta^2 > 0$, and there are strong significant differences among
observers (p < 0.001) as well as among animals (p < 0.001).

Furthermore, to evaluate the relative magnitude of the variance components
$\sigma_e^2$, $\sigma_\beta^2$, and $\sigma_\alpha^2$, we may obtain their unbiased estimates using the formulae
(3.7.16), (3.7.17), and (3.7.18), respectively. Hence, we find that

$$\hat{\sigma}_e^2 = 2.063,$$

$$\hat{\sigma}_\beta^2 = \frac{1}{5}(2{,}149.424 - 2.063) = 429.472,$$

$$\hat{\sigma}_\alpha^2 = \frac{1}{10}(20.330 - 2.063) = 1.827,$$

and the best estimate of the total variance $\sigma_y^2$ is given by

$$\hat{\sigma}_y^2 = \hat{\sigma}_e^2 + \hat{\sigma}_\beta^2 + \hat{\sigma}_\alpha^2 = 2.063 + 429.472 + 1.827 = 433.362.$$


Now, the estimated proportions of the relative contribution of the variance
components to the total variance are:

$$\frac{\hat{\sigma}_e^2}{\hat{\sigma}_y^2} = \frac{2.063}{433.362} = 0.005,$$

$$\frac{\hat{\sigma}_\beta^2}{\hat{\sigma}_y^2} = \frac{429.472}{433.362} = 0.991,$$

and

$$\frac{\hat{\sigma}_\alpha^2}{\hat{\sigma}_y^2} = \frac{1.827}{433.362} = 0.004.$$


Thus, we note that about 99 percent of the variation in the observations is
attributable to animals. This would probably be the most important finding in
the experiment suggesting that the cattle vary vastly in their habits of grazing.

To obtain a 95 percent confidence interval for $\sigma_e^2$, we have


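$$MS_E = 2.063, \quad \chi^2[36, 0.025] = 21.34, \quad \text{and} \quad \chi^2[36, 0.975] = 54.44.$$

Substituting these values in (3.8.2), the desired 95 percent confidence interval
for $\sigma_e^2$ is given by

$$P\left[\frac{36 \times 2.063}{54.44} < \sigma_e^2 < \frac{36 \times 2.063}{21.34}\right] = 0.95$$

or

$$P[1.364 < \sigma_e^2 < 3.480] = 0.95.$$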
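
The variance component estimates, their relative contributions, and the confidence interval for the error variance follow directly from the mean squares of Table 3.6. A minimal Python sketch is given below; the variable names are illustrative assumptions, and the chi-square percentiles are simply the tabulated values quoted in the text rather than values computed by the script.

```python
# Mean squares from Table 3.6 (a = 5 observers, b = 10 animals).
ms_observer, ms_animal, ms_error = 20.330, 2149.424, 2.063
a, b = 5, 10

# Analysis of variance (moment) estimates of the variance components.
var_error = ms_error
var_animal = (ms_animal - ms_error) / a      # coefficient of sigma_beta^2 is a
var_observer = (ms_observer - ms_error) / b  # coefficient of sigma_alpha^2 is b
var_total = var_error + var_animal + var_observer

proportions = {
    "error": var_error / var_total,
    "animal": var_animal / var_total,
    "observer": var_observer / var_total,
}

# 95 percent confidence interval for the error variance.
df_error = (a - 1) * (b - 1)
chi2_lower, chi2_upper = 21.34, 54.44        # chi-square[36; 0.025], [36; 0.975]
ci_error = (df_error * ms_error / chi2_upper, df_error * ms_error / chi2_lower)

print(f"variance components: error {var_error:.3f}, "
      f"animal {var_animal:.3f}, observer {var_observer:.3f}")
for name, share in proportions.items():
    print(f"share of total variance due to {name}: {share:.3f}")
print(f"95% CI for error variance: ({ci_error[0]:.3f}, {ci_error[1]:.3f})")
```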


3.15 WORKED EXAMPLE FOR MODEL III


In a plastics manufacturing factory, it is discovered that there is considerable 
variation in the breaking strength of the plastics produced by three different 
machines. The raw material is considered to be uniform and hence can be 
discarded as a possible source of variability. An experiment was performed to 
determine the effects of the machine and the operator on the breaking strength. 
Four operators were randomly selected and each assigned to a machine. The 
data are given in Table 3.7. 

TABLE 3.7
Breaking Strength of the Plastics (in lbs.)

                   Operator
Machine      1      2      3      4
   1        106    110    106    104
   2        107    111    108    110
   3        109    113    112    111

It is desired to test whether there are significant differences among machines
and operators. Since four operators were selected at random from a large pool of
operators, who in turn were assigned to three specific machines, the experiment
fits the assumptions of the mixed effects model. The mathematical model would
be

$$y_{ij} = \mu + \alpha_i + \beta_j + e_{ij} \qquad (i = 1, 2, 3;\ j = 1, 2, 3, 4),$$

where $\mu$ is the general mean, $\alpha_i$ is the effect of the i-th machine, $\beta_j$ is the effect
of the j-th operator, and the $e_{ij}$'s are random errors with

$$\sum_{i=1}^{3} \alpha_i = 0, \quad \beta_j \sim N(0, \sigma_\beta^2), \quad \text{and} \quad e_{ij} \sim N(0, \sigma_e^2).$$


It is further assumed that no interaction between the machine and the operator
is likely to exist and the $\beta_j$'s and the $e_{ij}$'s are mutually and completely
independent.

For the validity of the preceding assumptions, it is, of course, necessary
that the experimenter take appropriate measures to ensure that operators
are randomly selected from the large pool of operators available. Moreover,
systematic errors due to other factors should be avoided, including possible
sources of variation in the working conditions and in measuring the breaking
strength. Random assignment of raw materials is also important.

To perform the analysis of variance computations, we first obtain the row 
and column totals as 


$$y_{1.} = 426, \quad y_{2.} = 436, \quad y_{3.} = 445;$$
$$y_{.1} = 322, \quad y_{.2} = 334, \quad y_{.3} = 326, \quad y_{.4} = 325;$$

and the grand total is

$$y_{..} = 1{,}307.$$




The other quantities needed in the calculations of the sums of squares are: 


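$$\frac{y_{..}^2}{ab} = \frac{(1{,}307)^2}{12} = 142{,}354.083,$$

$$\frac{1}{b}\sum_{i=1}^{a} y_{i.}^2 = \frac{1}{4}\left[(426)^2 + (436)^2 + (445)^2\right] = 142{,}399.250,$$

$$\frac{1}{a}\sum_{j=1}^{b} y_{.j}^2 = \frac{1}{3}\left[(322)^2 + (334)^2 + (326)^2 + (325)^2\right] = 142{,}380.333,$$

and

$$\sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij}^2 = (106)^2 + (110)^2 + \cdots + (111)^2 = 142{,}437.$$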


The resultant sums of squares are, therefore, given by 


$$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij}^2 - \frac{y_{..}^2}{ab} = 142{,}437 - 142{,}354.083 = 82.917,$$

$$SS_A = \frac{1}{b}\sum_{i=1}^{a} y_{i.}^2 - \frac{y_{..}^2}{ab} = 142{,}399.250 - 142{,}354.083 = 45.167,$$

$$SS_B = \frac{1}{a}\sum_{j=1}^{b} y_{.j}^2 - \frac{y_{..}^2}{ab} = 142{,}380.333 - 142{,}354.083 = 26.250,$$

and

$$SS_E = SS_T - SS_A - SS_B = 82.917 - 45.167 - 26.250 = 11.500.$$


Finally, the results of the analysis of variance calculations are summarized in 
Table 3.8. 

If we choose the level of significance $\alpha = 0.05$, we find from Appendix
Table V that F[2, 6; 0.95] = 5.14. Since the calculated F value of 11.78 exceeds
5.14 (p = 0.008), we may conclude that the machines differ significantly.
Similarly, we find that F[3, 6; 0.95] = 4.76, which is greater than the calculated
value of 4.57 (p = 0.054) and so we do not reject the hypothesis of no significant
operator effects.

To determine which machines differ, we use Tukey’s and Scheffé’s proce- 
dures for paired comparisons. For the Tukey’s procedure, we find from Ap- 
pendix Table X that 


$$q[a, (a-1)(b-1); 1-\alpha] = q[3, 6; 0.95] = 4.34.$$




TABLE 3.8
Analysis of Variance for the Breaking Strength Data of Table 3.7

Source of   Degrees of   Sum of     Mean      Expected
Variation   Freedom      Squares    Square    Mean Square                                           F Value   p-Value
Machine         2        45.167     22.583    $\sigma_e^2 + \frac{4}{2}\sum_{i=1}^{3}\alpha_i^2$    11.78     0.008
Operator        3        26.250      8.750    $\sigma_e^2 + 3\sigma_\beta^2$                         4.57     0.054
Error           6        11.500      1.917    $\sigma_e^2$
Total          11        82.917


Now, the pairwise differences of sample means for machines are compared to

$$q[a, (a-1)(b-1); 1-\alpha]\sqrt{\frac{MS_E}{b}} = 4.34\sqrt{\frac{1.917}{4}} = 3.01.$$

The three sample means for machines are

$$\bar{y}_{1.} = 106.50, \quad \bar{y}_{2.} = 109.00, \quad \text{and} \quad \bar{y}_{3.} = 111.25,$$

and there are three pairs of differences to be compared. Furthermore,

$$|\bar{y}_{1.} - \bar{y}_{2.}| = 2.50 < 3.01,$$
$$|\bar{y}_{1.} - \bar{y}_{3.}| = 4.75 > 3.01,$$

and

$$|\bar{y}_{2.} - \bar{y}_{3.}| = 2.25 < 3.01.$$


Hence, we may conclude that machines one and three are probably significantly 
different but not the others. 
For Scheffé's procedure, we find from Appendix Table V that

$$S^2 = F[a-1, (a-1)(b-1); 1-\alpha] = F[2, 6; 0.95] = 5.14.$$

Furthermore, for the contrasts consisting of the differences between the two
means, we have

$$\frac{1}{b}\sum_{i=1}^{a} \ell_i^2 = \frac{2}{4} = \frac{1}{2}.$$




So that the differences among the sample means are now compared to

$$\sqrt{S^2 (a-1) MS_E \left(\frac{1}{b}\sum_{i=1}^{a} \ell_i^2\right)} = \sqrt{5.14\,(2)\,(1.917)\left(\frac{1}{2}\right)} = 3.14.$$


Here, again we conclude that machines one and three are probably significantly 
different but not the others. 

Furthermore, if the experimenter is interested in estimating the magnitudes
of the variance components $\sigma_e^2$ and $\sigma_\beta^2$, these may be estimated unbiasedly by
using the formulae (3.7.22) and (3.7.23) as

$$\hat{\sigma}_e^2 = 1.917$$

and

$$\hat{\sigma}_\beta^2 = \frac{1}{3}(8.750 - 1.917) = 2.278.$$


Finally, suppose we want to calculate the power of the test when the effect of
one machine is higher than the other two by 3 lbs. Thus, we have

$$\alpha_3 = 3 + \alpha_1 = 3 + \alpha_2,$$

which, together with the constraint $\sum_{i=1}^{3} \alpha_i = 0$, gives $\alpha_1 = \alpha_2 = -1$ and $\alpha_3 = 2$.
So that

$$\phi = \frac{1}{\sigma_e}\sqrt{\frac{b\sum_{i=1}^{a}\alpha_i^2}{a}} = \sqrt{\frac{4\{(-1)^2 + (-1)^2 + (2)^2\}}{3 \times 1.917}} = 2.04,$$

where an estimate of $\sigma_e^2$ is obtained from $MS_E = 1.917$. Now, using the Pearson-
Hartley chart, given in Appendix Chart II, with $\nu_1 = 2$, $\nu_2 = 6$, $\alpha = 0.05$, and
$\phi = 2.04$, the power is found to be about 0.66.
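
The noncentrality quantity $\phi$ used to enter the Pearson-Hartley chart can be reproduced as follows. This is a sketch in Python under the assumptions stated in the text (one machine effect exceeds the other two by 3 lbs., the effects sum to zero, and $\sigma_e^2$ is replaced by its estimate 1.917); the variable names are illustrative. The power itself must still be read from the chart or obtained from software for the noncentral F distribution.

```python
import math

a, b = 3, 4          # numbers of machines and operators
delta = 3.0          # amount by which one machine effect exceeds the other two
var_error = 1.917    # estimate of the error variance (MS_E from Table 3.8)

# Effects satisfying alpha_3 = delta + alpha_1 = delta + alpha_2 and sum = 0:
# alpha_1 = alpha_2 = -delta/a and alpha_3 = delta - delta/a.
alphas = [-delta / a, -delta / a, delta - delta / a]

# phi = sqrt( b * sum(alpha_i^2) / (a * sigma_e^2) )
phi = math.sqrt(b * sum(x * x for x in alphas) / (a * var_error))

print("machine effects:", alphas)
print(f"phi = {phi:.2f} with nu1 = {a - 1}, nu2 = {(a - 1) * (b - 1)}")
```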


3.16 WORKED EXAMPLE FOR MISSING VALUE ANALYSIS 


Consider the loss in weight due to wear testing data given in Table 3.3 and 
suppose that the observation corresponding to Material 1 and Position 1 is 
missing. We then have: 


$$a = 4, \quad b = 3, \quad y'_{..} = 2{,}633, \quad y'_{1.} = 544, \quad \text{and} \quad y'_{.1} = 664.$$




Therefore, on substituting in the formula (3.10.1), we obtain

$$\hat{y}_{11} = \frac{3(664) + 4(544) - 2{,}633}{(4-1)(3-1)} = 255.8.$$


This value is then entered in Table 3.3 in place of 241 and all sums of squares 
are computed as usual. The relevant computations are given in the following. 
The row and column totals, after substituting for the missing value, are: 


$$y_{1.} = 799.8, \quad y_{2.} = 654, \quad y_{3.} = 738, \quad y_{4.} = 697;$$
$$y_{.1} = 919.8, \quad y_{.2} = 1{,}020, \quad y_{.3} = 949;$$

and the grand total is

$$y_{..} = 2{,}888.8.$$
The other quantities needed in the calculations of the sums of squares are: 


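$$\frac{y_{..}^2}{ab} = \frac{(2{,}888.8)^2}{4 \times 3} = 695{,}430.453,$$

$$\frac{1}{b}\sum_{i=1}^{a} y_{i.}^2 = \frac{1}{3}\left[(799.8)^2 + (654)^2 + (738)^2 + (697)^2\right] = 699{,}283.013,$$

$$\frac{1}{a}\sum_{j=1}^{b} y_{.j}^2 = \frac{1}{4}\left[(919.8)^2 + (1{,}020)^2 + (949)^2\right] = 696{,}758.260,$$

and

$$\sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij}^2 = (255.8)^2 + (270)^2 + \cdots + (227)^2 = 701{,}674.640.$$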
i=1 j=l 

The resultant sums of squares are, therefore, given by 


$$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij}^2 - \frac{y_{..}^2}{ab} = 701{,}674.640 - 695{,}430.453 = 6{,}244.187,$$

$$SS_A = \frac{1}{b}\sum_{i=1}^{a} y_{i.}^2 - \frac{y_{..}^2}{ab} = 699{,}283.013 - 695{,}430.453 = 3{,}852.560,$$

$$SS_B = \frac{1}{a}\sum_{j=1}^{b} y_{.j}^2 - \frac{y_{..}^2}{ab} = 696{,}758.260 - 695{,}430.453 = 1{,}327.807,$$

and

$$SS_E = SS_T - SS_A - SS_B = 6{,}244.187 - 3{,}852.560 - 1{,}327.807 = 1{,}063.820.$$

Finally, the results of the analysis of variance calculations are summarized in
Table 3.9. Notice that the degrees of freedom in the total and error sums of
squares are reduced by one.

TABLE 3.9
Analysis of Variance for the Weight Loss Data of Table 3.3 with
One Missing Value

Source of   Degrees of   Sum of       Mean Square   Mean Square
Variation   Freedom      Squares      (Biased)      (Unbiased)    F Value   p-Value
Material        3        3,852.560    1,284.187       987.199     4.64      0.066
Position        2        1,327.807      663.904       576.424     2.71      0.159
Error          6-1       1,063.820      212.764       212.764
Total         11-1       6,244.187

To correct for the bias, from equation (3.10.5), the quantity to be subtracted
from the material mean square is

$$\frac{[664 - (4-1)(255.8)]^2}{4(4-1)^2} = 296.988.$$

This gives 1,284.187 − 296.988 = 987.199 for the correct mean square. Similarly,
from equation (3.10.6), the quantity to be subtracted from the position
mean square is

$$\frac{[544 - (3-1)(255.8)]^2}{3(3-1)^2} = 87.480.$$

This gives 663.904 − 87.480 = 576.424 for the correct mean square. The
corrected mean squares, the variance ratios, and the associated p-values are also
shown in Table 3.9. Note that the conclusions drawn in the Worked Example in
Section 3.13 regarding differences in material and position effects are slightly
affected: the material effect is no longer significant at the 0.05 level (p = 0.066
as against p = 0.039).

From equation (3.10.2), the estimated standard error of the sample mean of
the material (with the missing value) is

$$SE(\bar{y}_{1.}) = \sqrt{\left[\frac{1}{3} + \frac{4}{3(4-1)(3-1)}\right](212.764)} = 10.872.$$




Similarly, from equation (3.10.3), the estimated standard error of the difference
between the means for materials 1 and 2 is

$$SE(\bar{y}_{1.} - \bar{y}_{2.}) = \sqrt{\left[\frac{2}{3} + \frac{4}{3(4-1)(3-1)}\right](212.764)} = 13.752.$$

The same standard error applies for the comparison of the mean of material
1 with means of materials 3 and 4. In contrast, the estimated standard error for the
comparison of means of a pair of materials with no missing values (i.e., 2 vs. 3, 2
vs. 4, and 3 vs. 4) is given by $\sqrt{2(212.764)/3} = 11.910$.
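
The arithmetic of the missing value analysis can be collected in one place. The Python sketch below reproduces the estimate of the missing observation, the bias corrections to the mean squares, and the three standard errors. The variable names are illustrative assumptions, and the expressions are written in the form that reproduces the numerical results printed in this section; the reader should consult equations (3.10.1) through (3.10.6) for the general statements.

```python
import math

a, b = 4, 3                   # numbers of materials and positions
row_total = 544               # total of remaining values, Material 1
col_total = 664               # total of remaining values, Position 1
grand_total = 2633            # total of all remaining values
ms_error = 212.764            # error mean square from Table 3.9

# Estimate of the missing observation, rounded to one decimal as in the text.
estimate = (a * row_total + b * col_total - grand_total) / ((a - 1) * (b - 1))
estimate = round(estimate, 1)

# Amounts to subtract from the biased mean squares of Table 3.9.
bias_material = (col_total - (a - 1) * estimate) ** 2 / (a * (a - 1) ** 2)
bias_position = (row_total - (b - 1) * estimate) ** 2 / (b * (b - 1) ** 2)
ms_material = 1284.187 - bias_material
ms_position = 663.904 - bias_position

# Standard errors: mean with the missing value, a difference involving it,
# and a difference between two materials with complete data.
extra = a / (b * (a - 1) * (b - 1))
se_mean = math.sqrt((1 / b + extra) * ms_error)
se_diff_missing = math.sqrt((2 / b + extra) * ms_error)
se_diff_complete = math.sqrt(2 * ms_error / b)

print(f"estimated missing value: {estimate}")
print(f"corrected mean squares: {ms_material:.3f}, {ms_position:.3f}")
print(f"F ratios: {ms_material / ms_error:.2f}, {ms_position / ms_error:.2f}")
print(f"standard errors: {se_mean:.3f}, {se_diff_missing:.3f}, "
      f"{se_diff_complete:.3f}")
```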


3.17 USE OF STATISTICAL COMPUTING PACKAGES 


Two-way fixed effects analysis of variance with one observation per cell and 
no missing values can be performed using the SAS ANOVA procedure. The 
missing observations could be estimated and then the ANOVA procedure could 
be employed to perform the modified analysis as described in Section 3.10. For 
random and mixed model analysis of variance, the F tests remain unchanged, 
and no special analysis is required. The moment estimates of variance compo- 
nents can easily be computed from the entries of the analysis of variance table. 
For estimating variance components, using other methods, PROC MIXED or 
VARCOMP can be used. The details of SAS commands for executing these 
procedures are given in Section 11.1. 

Among SPSS procedures, either ANOVA, MANOVA, or GLM could be 
used, although ANOVA would be simpler. The analysis of data with missing 
values could be handled as indicated previously. As before, for random and
mixed effects analysis, no special tests are required. Further, SPSS Release 
7.5 now includes a VARCOMP procedure which provides for three methods 
for the estimation of variance components. For instructions regarding SPSS 
commands, see Section 11.2. 

In using the BMDP package, two programs suited for this model are 7D and
2V if the analysis involves only fixed effects in the model. For the analysis
involving random and mixed effects models, 3V and 8V are recommended.


3.18 WORKED EXAMPLES USING STATISTICAL PACKAGES 


In this section, we illustrate the applications of statistical packages to perform 
two-way analysis of variance with one observation per cell for the data sets 
employed in examples presented in Sections 3.13 through 3.15. Figures 3.1, 3.2, 
and 3.3 illustrate the program instructions and the output results for analyzing 
data in Tables 3.3, 3.5, and 3.7 using SAS ANOVA/GLM, SPSS ANOVA/GLM 
and BMDP 2V/8V procedures. The typical output provides the data format 
listed at the top, cell means, and the entries of the analysis of variance table. 
Note that in each case the results are the same as those provided using manual
computations in Sections 3.13 through 3.15.
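
Besides the analysis of variance table, the SAS output in Figure 3.1 reports a few summary quantities (R-Square, C.V., Root MSE, and the mean of the dependent variable). As a rough guide to how these relate to the hand computations of Section 3.13, the following Python sketch recomputes them from the sums of squares; it assumes the usual definitions (model sum of squares over corrected total, and root mean square error as a percentage of the grand mean) and is not a description of the package's internal algorithm.

```python
import math

# Quantities from Section 3.13 (Table 3.3 data, a = 4, b = 3).
ss_material = 3141.0 + 2.0 / 3.0   # 3,141.667
ss_position = 1683.5
ss_total = 5999.0
grand_total, n_obs = 2874, 12
df_error = 6

ss_model = ss_material + ss_position
ss_error = ss_total - ss_model
ms_error = ss_error / df_error

r_square = ss_model / ss_total               # proportion of variation explained
root_mse = math.sqrt(ms_error)               # estimated error standard deviation
grand_mean = grand_total / n_obs
coeff_var = 100.0 * root_mse / grand_mean    # C.V. in percent

print(f"R-Square = {r_square:.6f}")
print(f"C.V.     = {coeff_var:.6f}")
print(f"Root MSE = {root_mse:.3f}")
print(f"Mean     = {grand_mean:.2f}")
```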




Program instructions:

    DATA WEARTEST;
    INPUT MATERIAL POSITION WEIGHT;
    DATALINES;
    1 1 241
    . . .
    4 3 227
    ;
    PROC ANOVA;
    CLASSES MATERIAL POSITION;
    MODEL WEIGHT=MATERIAL POSITION;
    RUN;

Output:

    The SAS System
    Analysis of Variance Procedure

    CLASS      LEVELS   VALUES
    MATERIAL     4      1 2 3 4
    POSITION     3      1 2 3
    NUMBER OF OBS. IN DATA SET=12

    Dependent Variable: WEIGHT
                            Sum of        Mean
    Source            DF    Squares       Square      F Value   Pr > F
    Model              5    4825.1667     965.0333    4.93      0.0388
    Error              6    1173.8333     195.6389
    Corrected Total   11    5999.0000

    R-Square    C.V.        Root MSE    WEIGHT Mean
    0.804328    5.840124    13.987      239.50

    Source      DF    Anova SS     Mean Square   F Value   Pr > F
    MATERIAL     3    3141.6667    1047.2222     5.35      0.0393
    POSITION     2    1683.5000     841.7500     4.30      0.0693

(i) SAS application: SAS ANOVA instructions and output for the two-way fixed effects
analysis of variance with one observation per cell.


Program instructions:

    DATA LIST
     /MATERIAL 1
      POSITION 3
      WEIGHT 5-7.
    BEGIN DATA.
    1 1 241
    1 2 270
    . . .
    4 3 227
    END DATA.
    ANOVA WEIGHT BY
      MATERIAL (1, 4)
      POSITION (1, 3)
     /MAXORDER=NONE
     /STATISTICS=ALL.

Output:

    ANOVA (a,b)
                                       Unique Method
                                       Sum of               Mean
                                       Squares      df      Square
    WEIGHT   Main Effects (Combined)   4825.167      5      965.033
                          MATERIAL     3141.667      3     1047.222
                          POSITION     1683.500      2      841.750
             Model                     4825.167      5      965.033
             Residual                  1173.833      6      195.639
             Total                     5999.000     11      545.364

    a WEIGHT by MATERIAL, POSITION
    b All effects entered simultaneously

(ii) SPSS application: SPSS ANOVA instructions and output for the two-way fixed
effects analysis of variance with one observation per cell.


Program instructions:

    /INPUT FILE='C:\SAHAI\TEXTO\EJES.TXT'.
    FORMAT=FREE.
    VARIABLES=3.
    NAMES=MAT,POS,WEIGHT.
    VARIABLE=MAT,POS.
    CODES(MAT)=1,2,3,4.
    NAMES(MAT)=M1,M2,M3,M4.
    CODES(POS)=1,2,3.
    NAMES(POS)=P1,P2,P3.
    DEPENDENT=WEIGHT.

Output:

    BMDP2V - ANALYSIS OF VARIANCE AND COVARIANCE WITH
    REPEATED MEASURES   Release: 7.0 (BMDP/DYNAMIC)

    ANALYSIS OF VARIANCE FOR THE 1-ST DEPENDENT VARIABLE
    THE TRIALS ARE REPRESENTED BY THE VARIABLES: WEIGHT
    THE HIGHEST ORDER INTERACTION IN EACH TABLE HAS BEEN
    REMOVED FROM THE MODEL SINCE THERE IS ONE SUBJECT PER CELL

    SOURCE      SUM OF          D.F.   MEAN            F
                SQUARES                SQUARE
    MEAN        688323.00000     1     688323.00000    3518.33
    MATERIAL      3141.66667     3       1047.22222       5.35
    POSITION      1683.50000     2        841.75000       4.30
    ERROR         1173.83333     6        195.63889

(iii) BMDP application: BMDP 2V instructions and output for the two-way fixed effects
analysis of variance with one observation per cell.


FIGURE 3.1 Program Instructions and Output for the Two-Way Fixed Effects 
Analysis of Variance with One Observation per Cell: Data on Loss in Weights Due 
to Wear Testing of Four Materials (Table 3.3) 


Program instructions:

    DATA ANMLGRAZ;
    INPUT OBSERVER ANIMAL GRAZING;
    DATALINES;
    1 1 34
    . . .
    5 10 33
    ;
    PROC GLM;
    CLASSES OBSERVER ANIMAL;
    MODEL GRAZING=OBSERVER ANIMAL;
    RANDOM OBSERVER ANIMAL;
    RUN;

Output:

    The SAS System
    General Linear Models Procedure

    CLASS      LEVELS   VALUES
    OBSERVER     5      1 2 3 4 5
    ANIMAL      10      1 2 3 4 5 6 7 8 9 10
    NUMBER OF OBS. IN DATA SET=50

    Dependent Variable: GRAZING
                            Sum of       Mean
    Source            DF    Squares      Square      F Value   Pr > F
    Model             13    19426.140    1494.318    724.23    0.0001
    Error             36       74.280       2.063
    Corrected Total   49    19500.420

    R-Square    C.V.        Root MSE    GRAZING Mean
    0.996191    2.337180    1.4364      61.460

    Source      DF    Type I SS     Mean Square   F Value    Pr > F
    OBSERVER     4       81.320       20.330         9.85    0.0001
    ANIMAL       9    19344.820     2149.424      1041.72    0.0001

    Source      DF    Type III SS   Mean Square   F Value    Pr > F
    OBSERVER     4       81.320       20.330         9.85    0.0001
    ANIMAL       9    19344.820     2149.424      1041.72    0.0001

    Source      Type III Expected Mean Square
    OBSERVER    Var(Error) + 10 Var(OBSERVER)
    ANIMAL      Var(Error) + 5 Var(ANIMAL)

(i) SAS application: SAS GLM instructions and output for the two-way random effects
analysis of variance with one observation per cell.


Program instructions:

    DATA LIST
     /OBSERVER 1
      ANIMAL 3-4
      GRAZING 6-7.
    BEGIN DATA.
    1 1 34
    1 2 76
    1 3 75
    . . .
    5 10 33
    END DATA.
    GLM GRAZING BY
      OBSERVER ANIMAL
     /DESIGN OBSERVER ANIMAL
     /RANDOM OBSERVER ANIMAL.

Output:

    Tests of Between-Subjects Effects    Dependent Variable: GRAZING

    Source                  Type III SS    df    Mean Square    F           Sig.
    OBSERVER  Hypothesis        81.320      4       20.330         9.853    .000
              Error             74.280     36        2.063(a)
    ANIMAL    Hypothesis     19344.820      9     2149.424      1041.724    .000
              Error             74.280     36        2.063(a)
    a MS(Error)

    Expected Mean Squares (a,b)
                      Variance Component
    Source      Var(OBSERVER)   Var(ANIMAL)   Var(Error)
    OBSERVER       10.000          .000         1.000
    ANIMAL           .000         5.000         1.000
    Error            .000          .000         1.000
    a For each source, the expected mean square equals the sum of the
      coefficients in the cells times the variance components, plus a
      quadratic term involving effects in the Quadratic Term cell.
    b Expected Mean Squares are based on the Type III Sums of Squares.

(ii) SPSS application: SPSS GLM instructions and output for the two-way random
effects analysis of variance with one observation per cell.


Program instructions:

    /INPUT FILE='C:\SAHAI\TEXTO\EJE6.TXT'.
    FORMAT=FREE.
    VARIABLES=10.
    /VARIABLE NAMES=A1,...,A10.
    /DESIGN NAMES=OBSR,ANIM.
    LEVELS=5,10.
    RANDOM=OBSR,ANIM.
    MODEL='O,A'.
    34 76 75 31 61 82 82 67 72 38
    . . .
    33 77 70 27 59 81 82 67 70 33

Output:

    BMDP8V - GENERAL MIXED MODEL ANALYSIS OF VARIANCE
    - EQUAL CELL SIZES   Release: 7.0 (BMDP/DYNAMIC)

    ANALYSIS OF VARIANCE DESIGN
    INDEX               OBSR   ANIM
    NUMBER OF LEVELS      5     10
    POPULATION SIZE      INF    INF
    MODEL   O, A

    ANALYSIS OF VARIANCE FOR DEPENDENT VARIABLE 1

    SOURCE        ERROR   SUM OF         D.F.   MEAN          F          TAIL
                  TERM    SQUARES               SQUARE                   PROB.
    1 MEAN                188866.5800      1    188866.580
    2 OBSERVER    OA          81.3200      4        20.330       9.85    0.0000
    3 ANIMAL      OA       19344.8200      9      2149.424    1041.72    0.0000
    4 OA                      74.2800     36         2.063

    SOURCE        EXPECTED MEAN             ESTIMATES OF
                  SQUARE                    VARIANCE COMPONENTS
    1 MEAN        50(1)+10(2)+5(3)+(4)      3733.97778
    2 OBSERVER    10(2)+(4)                    1.82667
    3 ANIMAL      5(3)+(4)                   429.47222
    4 OA          (4)                          2.06333

(iii) BMDP application: BMDP 8V instructions and output for the two-way random
effects analysis of variance with one observation per cell.


FIGURE 3.2 Program Instructions and Output for the Two-Way Random Effects 
Analysis of Variance with One Observation per Cell: Data on Number of Minutes 
Observed in Grazing (Table 3.5). 


Program instructions:

    DATA BREAKSTR;
    INPUT MACHINE OPERATOR BREAKING;
    DATALINES;
    1 1 106
    1 2 110
    . . .
    3 4 111
    ;
    PROC GLM;
    CLASSES MACHINE OPERATOR;
    MODEL BREAKING=MACHINE OPERATOR;
    RANDOM OPERATOR;
    RUN;

Output:

    The SAS System
    General Linear Models Procedure

    CLASS      LEVELS   VALUES
    MACHINE      3      1 2 3
    OPERATOR     4      1 2 3 4
    NUMBER OF OBS. IN DATA SET=12

    Dependent Variable: BREAKING
                            Sum of      Mean
    Source            DF    Squares     Square     F Value   Pr > F
    Model              5    71.4167     14.2833    7.45      0.0149
    Error              6    11.5000      1.9167
    Corrected Total   11    82.9167

    R-Square    C.V.        Root MSE    BREAKING Mean
    0.861307    1.271098    1.3844      108.92

    Source      DF    Type I SS     Mean Square   F Value   Pr > F
    MACHINE      2    45.1667       22.5833       11.78     0.0084
    OPERATOR     3    26.2500        8.7500        4.57     0.0543

    Source      DF    Type III SS   Mean Square   F Value   Pr > F
    MACHINE      2    45.1667       22.5833       11.78     0.0084
    OPERATOR     3    26.2500        8.7500        4.57     0.0543

    Source      Type III Expected Mean Square
    MACHINE     Var(Error) + Q(MACHINE)
    OPERATOR    Var(Error) + 3 Var(OPERATOR)

(i) SAS application: SAS GLM instructions and output for the two-way mixed effects
analysis of variance with one observation per cell.


Program instructions:

    DATA LIST
     /MACHINE 1
      OPERATOR 3
      BREAKING 5-7.
    BEGIN DATA.
    1 1 106
    1 2 110
    1 3 106
    . . .
    3 4 111
    END DATA.
    GLM BREAKING BY
      MACHINE OPERATOR
     /DESIGN MACHINE OPERATOR
     /RANDOM OPERATOR.

Output:

    Tests of Between-Subjects Effects    Dependent Variable: BREAKING

    Source                  Type III SS    df    Mean Square    F         Sig.
    MACHINE   Hypothesis       45.167       2      22.583       11.783    .008
              Error            11.500       6       1.917(a)
    OPERATOR  Hypothesis       26.250       3       8.750        4.565    .054
              Error            11.500       6       1.917(a)
    a MS(Error)

    Expected Mean Squares (a,b)
                      Variance Component
    Source      Var(OPERATOR)   Var(Error)   Quadratic Term
    MACHINE          .000         1.000      Machine
    OPERATOR        3.000         1.000
    Error            .000         1.000
    a For each source, the expected mean square equals the sum of the
      coefficients in the cells times the variance components, plus a
      quadratic term involving effects in the Quadratic Term cell.
    b Expected Mean Squares are based on the Type III Sums of Squares.

(ii) SPSS application: SPSS GLM instructions and output for the two-way mixed effects
analysis of variance with one observation per cell.


Program instructions:

    /INPUT FILE='C:\SAHAI\TEXTO\EJE7.TXT'.
    FORMAT=FREE.
    VARIABLES=4.
    /VARIABLE NAMES=O1,...,O4.
    /DESIGN NAMES=M,O.
    LEVELS=3,4.
    RANDOM=O.
    MODEL='M,O'.
    /END
    106 110 106 104
    107 111 108 110
    109 113 112 111

Output:

    BMDP8V - GENERAL MIXED MODEL ANALYSIS OF VARIANCE
    - EQUAL CELL SIZES   Release: 7.0 (BMDP/DYNAMIC)

    ANALYSIS OF VARIANCE DESIGN
    INDEX               M     O
    NUMBER OF LEVELS    3     4
    POPULATION SIZE    INF   INF
    MODEL   M, O

    ANALYSIS OF VARIANCE FOR DEPENDENT VARIABLE 1

    SOURCE        ERROR   SUM OF         D.F.   MEAN          F        TAIL
                  TERM    SQUARES               SQUARE                 PROB.
    1 MEAN                142354.0833      1    142354.083
    2 MACHINE     MO          45.1667      2        22.583    11.78    0.0084
    3 OPERATOR    MO          26.2500      3         8.750     4.57    0.0543
    4 MO                      11.5000      6         1.917

    SOURCE        EXPECTED MEAN             ESTIMATES OF
                  SQUARE                    VARIANCE COMPONENTS
    1 MEAN        12(1)+4(2)+3(3)+(4)       11860.38889
    2 MACHINE     4(2)+(4)                      5.16667
    3 OPERATOR    3(3)+(4)                      2.27778
    4 MO          (4)                           1.91667

(iii) BMDP application: BMDP 8V instructions and output for the two-way mixed
effects analysis of variance with one observation per cell.


FIGURE 3.3. Program Instructions and Output for the Two-Way Mixed Effects 
Analysis of Variance with One Observation per Cell: Data on Breaking Strength 
of the Plastics (Table 3.7) 




3.19 EFFECTS OF VIOLATIONS OF ASSUMPTIONS 
OF THE MODEL 


For the two-way crossed model (3.1.1), with a single observation per cell and no
interaction, Welch (1937) and Pitman (1938) compared the moments of the beta
statistic, related to the usual F statistic, both under the assumption of normality
and under permutation theory. There was close agreement between the two
results, which lends support to the robust nature of the F test, as in the case
of one-way classification. For a further discussion of this point, the reader is
referred to Hinkelmann and Kempthorne (1994, Chapter 8).

The problem of the effect of unequal error variances on the inferences of the 
model (3.1.1) was studied by Box (1954b). The results showed that for minor 
departures from homoscedasticity, the effects are not large. If the variances 
differ row-wise but are constant over columns, then the actual $\alpha$-level for the
test of row effects is slightly greater than the nominal value. For the test of the
column effects, the reverse is true. 

Box (1954b) also studied the effect of a first-order serial correlation be- 
tween rows within columns. It was found that treatment (row) comparisons are 
not appreciably affected by serial correlation between treatment measurements 
within a block (column). However, serial correlations among the measurements 
on each treatment can seriously affect the validity of treatment comparisons. 

For a discussion of effects of violation of assumptions under Models II and 
III, see Section 4.19. 


EXERCISES 


1. The drained weights of frozen oranges were measured for various 
compositions and concentrations of a drink. The original weights in 
each case were the same. Any observed differences in drained weights 
can thus be attributed to differences in concentration or composition 
of the drink. The relevant data on weights (oz.) are as follows.


                               Composition
Concentration (%)      C1       C2       C3       C4
       20            21.52    21.32    22.19    22.19
       30            22.32    21.52    22.32    23.15
       40            22.56    23.12    22.42    21.52
       50            23.31    22.15    21.32    22.16


(a) Describe the mathematical model and the assumptions for the 
experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Does the concentration of the drink have a significant effect on
the drained weight? Use $\alpha = 0.05$.

(d) Does the composition of the drink have a significant effect on
the drained weight? Use $\alpha = 0.05$.


(e) If there are significant differences in drained weights due to drink
concentration, use a suitable multiple comparison method to
determine which concentrations differ. Use $\alpha = 0.01$.

(f) Same as part (e) but for drink composition. 

2. Three levels of fertilizer in combination with two levels of irrigation 
were employed in a field experiment. The six treatment combinations 
were randomly assigned to plots. The relevant data on yields are as 
follows. 


Level of              Level of Fertilizer
Irrigation       High      Medium      Low
Yes              380        340        305
No               330        360        340


(a) Describe the appropriate mathematical model and the assump- 
tions for the experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Are there significant differences among the levels of the fertilizer?
Use $\alpha = 0.05$.

(d) Obtain point and interval estimates of the overall difference in 
yield due to irrigation. 

(e) Obtain point and interval estimates of the mean difference in 
yield between high and low fertilizer levels. 

3. An experiment was conducted to assess the effect of four brands of 
cutting fluids on the abrasive wear of four types of cutting tools. The 
measure of wear was reported in terms of the logarithm of loss of tool 
flank weight (in grams times 100) in a 1-hour test run. The relevant 
results are summarized as follows. 


Cutting Fluid Brand 


Cutting Tool CF1 CF2 CF3 CF4 


CT1 1.171 1.057 1.061 1.011 
CT2 0.705 0.612 0.631 0.598 
CT3 0.538 0.418 0.457 0.412 
CT4 0.414 1.308 1.371 1.251 


(a) Describe the mathematical model and the assumptions for the 
experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Do the fluid brands have a significant effect on the measure of
abrasive wear? Use α = 0.05.

(d) Does the type of cutting tool have a significant effect on the
measure of abrasive wear? Use α = 0.05.

(e) If there are significant differences in the measure of abrasive
wear due to the fluid brand, use a suitable multiple comparison
method to determine which fluid brands differ. Use α = 0.01.




(f) Same as part (e) but for the type of cutting tool. 

4. During the manufacturing process of a certain component, its break- 
ing strength was measured for three operating temperatures and for 
each temperature there was influence of seven furnace pressures. The 
relevant data in certain standard units are given as follows. 


                                  Pressure
Temperature    P1      P2      P3      P4      P5      P6      P7
T1            0.803   0.836   1.303   1.276   1.161   1.054   1.307
T2            0.705   0.630   1.005   1.062   0.616   0.803   0.618
T3            1.321   0.815   0.771   1.110   0.710   1.022   0.717


(a) Describe the mathematical model and the assumptions for the 
experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Do the operating temperatures have a significant effect on the
breaking strength? Use α = 0.05.

(d) Do the furnace pressures have a significant effect on the breaking
strength? Use α = 0.05.

(e) If there are significant differences in breaking strength due to
temperatures, use a suitable multiple comparison method to
determine which temperatures differ. Use α = 0.01.

(f) Same as part (e) but for the furnace pressure. 

5. A university computing department manages four resource centers 
on the campus. Each center houses one timesharing terminal and 
two types of personal computers. During a given week, the numbers 
of hours a certain type of computing machine was being used were 
recorded and the relevant data are as follows. 


Resource Center 


Equipment                 1     2     3     4
Time-Sharing Terminal    70    70    50    60
Apple Computer           40    40    20    40
IBM Computer             30    30    10    30


(a) Describe the mathematical model you will employ to analyze 
the effects of resource center location and the type of computing 
machine. 

(b) Analyze the data and report the analysis of variance table. 

(c) Perform appropriate F tests to determine whether the two factors
have main effects. Use α = 0.05.

6. A factorial experiment was performed to study the effect of the level 
of pressure and the level of temperature on the compressive strength 
of thermoplastics. The relevant data in certain standard units are given
as follows.

                          Temperature (°F)
Pressure (lb/in.²)     250      260      270
120                    8.00    10.57     8.30
130                    8.01     9.40     8.86
140                    7.72    10.30     8.32
150                    8.14     9.73     8.01

(a) Describe the mathematical model and the assumptions for the
experiment considering both factors to be fixed.

(b) Analyze the data and report the analysis of variance table.

(c) Does the level of temperature have a significant effect on the
compressive strength? Use α = 0.05.

(d) Does the level of pressure have a significant effect on the
compressive strength? Use α = 0.05.

(e) If there are significant differences in the compressive strength due
to the temperature, use a suitable multiple comparison method
to determine which temperatures differ. Use α = 0.01.

(f) Same as part (e) but for the pressure.

(g) Suppose the temperature and pressure levels are selected at
random. Assuming that there is no interaction between pressure
and temperature, state the analysis of variance model and report
appropriate conclusions. How do your conclusions differ from
those obtained in parts (c) and (d)?


7. A study was conducted to determine the effect of the size of a group on
the results of a brainstorming session. Three different types of
company executives were used, one for each group size. Each group was
assigned a problem and was given an hour to generate ideas. The
variable of interest was the number of new ideas proposed. The relevant
data are given as follows.

                               Size of Group
Type of Group               2     3     4     5
Sales executives           22    30    38    34
Advertising executives     19    25    32    31
Marketing executives       16    20    26    28

(a) Describe the mathematical model you will employ to analyze
the effects of group size and the type of group on the number of
new ideas being proposed.

(b) Perform appropriate F tests to determine whether the factors
have any main effects. Use α = 0.05.

(c) If there are significant differences between the types of groups,
use a suitable multiple comparison method to determine which
group types differ. Use α = 0.01.




8. Consider a two-factor experiment designed to investigate the break- 
ing strength of a bond of pieces of material. There are 5 ingots of a 
composition material that are used with one of the three metals as the 
bonding agent. The data on amount of pressure required to break a 
bond from an ingot that uses one of the metals as the bonding agent 
are given as follows. 


              Type of Metal
Ingot    Copper    Iron    Nickel
1         83.3     83.0     78.1
2         77.5     79.9     78.6
3         85.6     93.8     87.1
4         78.4     89.2     83.8
5         84.3     85.3     84.2


(a) Describe the mathematical model and the assumptions for the 
experiment. It is assumed that metals are fixed and ingots are 
random. 

(b) Analyze the data and report the analysis of variance table. 

(c) Do the metals have a significant effect on the breaking strength?
Use α = 0.05.

(d) Do the ingots have a significant effect on the breaking strength?
Use α = 0.05.

(e) If there are significant differences in the breaking strength due to
metals, use a suitable multiple comparison method to determine
which metals differ. Use α = 0.01.

(f) Obtain point and interval estimates of the variance components 
associated with the assumptions of the model given in part (a). 

9. Mosteller and Tukey (1977, p. 503) reported data from a study where
six experimenters measured the specific heat of water at various
temperatures. The interest lies in investigating the reliability of the
measurements and in determining an accurate estimate of the specific heat.
The data are given as follows.


Temperature (°C) 


Investigator 5 10 15 20 25 30 

Liidin 1.0027 1.0010 1.0000 0.9994 0.9993 0.9996 
Dieterici 1.0050 1.0021 1.0000 0.9987 0.9983 0.9984 
Bonsfreld 1.0039 1.0016 1.0000 0.9991 0.9989 0.9990 
Ronland 1.0054 1.0019 1.0000 0.9979 0.9972 0.9969 
Bartollis 1.0041 1.0017 1.0000 0.9994 1.0000 1.0016 
Janke 1.0040 1.0016 1.0000 0.9991 0.9987 0.9988 


Source: Mosteller and Tukey (1977, p. 503). Used with permission. 


(a) Describe the mathematical model and the assumptions for the
experiment. Would you use Model I, Model II, or Model III?
Explain.

(b) Analyze the data and report the analysis of variance table.

(c) Does the level of the temperature have a significant effect on the
measurement of specific heat? Use α = 0.05.

(d) Does the investigator have a significant effect on the
measurement of specific heat? Use α = 0.05.

(e) If there are significant differences in specific heats due to
temperature, use a suitable multiple comparison method to determine
which levels of the temperature differ. Use α = 0.01.

(f) Same as in part (e) but for the investigator.

(g) If you assumed the investigator to be a random factor, determine
the point and interval estimates of the variance components of
the model.


10. Weekes (1983, Table 1.1) reported data of Michelson and Morley on 
the speed of light. The data came from 5 experiments, each consisting 
of 20 consecutive runs. The results are given below where reported 
measurement is the speed of light in suitable units. 


Experiment 
Run 1 2 3 4 5 
1 850 960 880 890 890 
2 740 940 880 810 840 
3 900 960 880 810 780 
4 1070 940 860 820 810 
5 930 880 720 800 760 
6 850 800 720 770 810 
7 950 850 620 760 790 
8 980 880 860 740 810 
9 980 900 970 750 820 
10 880 840 950 760 850 
11 1000 830 880 910 870 
12 980 790 910 920 870 
13 930 810 850 890 810 
14 650 880 870 860 740 
15 760 880 840 880 810 
16 810 830 840 720 940 
17 1000 800 850 840 950 
18 1000 790 840 850 800 
19 960 760 840 850 810 
20 960 800 840 780 870 


Source: Weekes (1983, Table 1.1). Used with permission. 




(a) Describe the mathematical model and the assumptions for the 
experiment. Would you use Model I, Model II, or Model III? 
Explain. 

(b) Analyze the data and report the analysis of variance table. 

(c) Does the experiment have a significant effect on the measurement
of the speed of light? Use α = 0.05.

(d) Does the run have a significant effect on the measurement of the
speed of light? Use α = 0.05.

(e) Assuming that the experiment and run effects are random, esti- 
mate the variance components of the model and determine their 
relative importance. 

11. Berry (1987) reported data from an experiment designed to investigate 
the performance of different types of electrodes. Five different types of 
electrodes were applied to the arms of 16 subjects and the readings for 
resistance were taken. The data are given below where the measures 
of resistance are given in the original units of kilohms. 


Electrode Type 


Subject 1 2 3 4 5 
1 500 400 98 200 250 
2 600 600 600 75 310 
3 250 370 220 250 220 
4 72 140 240 33 54 
5 135 300 450 430 70 
6 27 84 135 190 180 
7 100 50 82 73 78 
8 105 180 32 58 32 
9 90 180 220 34 64 

10 200 290 320 280 135 

11 15 45 75 88 80 

12 160 200 300 300 220 

13 250 400 50 50 92 

14 170 310 230 20 150 

15 66 1000 1050 280 220 

16 107 48 26 45 51 


Source: Berry (1987). Used with permission. 


(a) Describe the mathematical model and the assumptions for the 
experiment. Would you use Model I, Model II, or Model III? 
Explain. 

(b) Analyze the data and report the analysis of variance table. 

(c) Does the electrode type have a significant effect on the measures
of resistance? Use α = 0.05.

(d) Does the subject have a significant effect on the measures of
resistance? Use α = 0.05.

(e) If there are significant differences in resistance due to electrode
type and it is considered to be a fixed effect, use a suitable
multiple comparison method to determine which electrode types
differ. Use α = 0.01.

(f) If you assumed any of the effects to be random, determine the 
point and interval estimates of the variance components of the 
model. 

(g) It is found that the measures of resistance have a skewed dis- 
tribution to the right. Make a logarithmic transformation on the 
data and repeat the analyses carried out in parts (b) through (f). 


Two-Way Crossed 
Classification 
with Interaction 


4.0 PREVIEW 


Suppose that we relax the requirement of model (3.1.1) that there be exactly one
observation in each of the a × b cells of the two-way layout. The model remains
the same except that we now use $y_{ijk}$ to designate the k-th observation
at the i-th level of A and the j-th level of B, that is, in the (i, j)-th cell. We
now suppose that there are n (n > 1) observations in each cell. With n = 1,
the model (3.1.1) will be a special case of the model being considered here.
With an arbitrary integer value of n, the analysis of variance will be a simple
extension of that described in Chapter 3. However, an important and somewhat
restrictive implication of the simple additive model discussed in Chapter 3 is
that the difference between the mean responses at two levels of A
is the same at each level of B. In many cases, this simple additive
model may not be appropriate. The failure of the differences between the mean
responses at the different levels of A to remain constant over the different levels
of B is attributed to interaction between the two factors. Having more than one
observation per cell allows a researcher to investigate the main effects of both
factors and their interaction. In this chapter, we study the model involving two
factors with interaction terms.


4.1 MATHEMATICAL MODEL 


Consider two factors A and B having a and b levels, respectively, and let there
be n observations at each combination of levels of A and B (i.e., n observations
in each of the a × b cells). Let $y_{ijk}$ be the k-th observation at the i-th level of
A and the j-th level of B. The data involving a total of N = abn scores $y_{ijk}$'s
can then be schematically represented as in Table 4.1.

The notations employed here are a straightforward extension of the notations 
for the preceding chapter. A dot in the subscript indicates aggregation or total 
over the variable represented by the index, and a dot and a bar represent the 
corresponding mean. Thus, the sum of the observations corresponding to the i-th 




TABLE 4.1
Data for a Two-Way Crossed Classification with n Replications per Cell

                                                     Factor B
Factor A   B_1                        B_2                        B_3                        ...   B_j                        ...   B_b
A_1        y_111, y_112, ..., y_11n   y_121, y_122, ..., y_12n   y_131, y_132, ..., y_13n   ...   y_1j1, y_1j2, ..., y_1jn   ...   y_1b1, y_1b2, ..., y_1bn
A_2        y_211, y_212, ..., y_21n   y_221, y_222, ..., y_22n   y_231, y_232, ..., y_23n   ...   y_2j1, y_2j2, ..., y_2jn   ...   y_2b1, y_2b2, ..., y_2bn
A_3        y_311, y_312, ..., y_31n   y_321, y_322, ..., y_32n   y_331, y_332, ..., y_33n   ...   y_3j1, y_3j2, ..., y_3jn   ...   y_3b1, y_3b2, ..., y_3bn
...
A_i        y_i11, y_i12, ..., y_i1n   y_i21, y_i22, ..., y_i2n   y_i31, y_i32, ..., y_i3n   ...   y_ij1, y_ij2, ..., y_ijn   ...   y_ib1, y_ib2, ..., y_ibn
...
A_a        y_a11, y_a12, ..., y_a1n   y_a21, y_a22, ..., y_a2n   y_a31, y_a32, ..., y_a3n   ...   y_aj1, y_aj2, ..., y_ajn   ...   y_ab1, y_ab2, ..., y_abn




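level of factor A and the j-th level of factor B is

\[
y_{ij.} = \sum_{k=1}^{n} y_{ijk}.
\]

The corresponding mean is

\[
\bar{y}_{ij.} = \frac{y_{ij.}}{n} = \frac{\sum_{k=1}^{n} y_{ijk}}{n}.
\]

The total of all the observations for the i-th level of factor A is

\[
y_{i..} = \sum_{j=1}^{b} \sum_{k=1}^{n} y_{ijk},
\]

and the corresponding mean is

\[
\bar{y}_{i..} = \frac{y_{i..}}{bn}.
\]

Similarly, for the j-th level of factor B, the sum of all the observations and
the corresponding mean are denoted by

\[
y_{.j.} = \sum_{i=1}^{a} \sum_{k=1}^{n} y_{ijk}
\]

and

\[
\bar{y}_{.j.} = \frac{y_{.j.}}{an}.
\]

Finally, the sum of all the observations, the grand total, is

\[
y_{...} = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n} y_{ijk},
\]

and the grand or overall mean is

\[
\bar{y}_{...} = \frac{y_{...}}{abn}.
\]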




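The analysis of variance model for this type of experimental layout is given
as

\[
y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + e_{ijk}
\qquad
\begin{cases}
i = 1, 2, \ldots, a \\
j = 1, 2, \ldots, b \\
k = 1, 2, \ldots, n,
\end{cases}
\tag{4.1.1}
\]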


where $-\infty < \mu < \infty$ is the overall mean, $\alpha_i$ is the effect due to the i-th level
of factor A, $\beta_j$ is the effect due to the j-th level of factor B, $(\alpha\beta)_{ij}$ is the
interaction effect representing the departure of the mean of the observations in
the (i, j)-th cell, denoted by $\mu_{ij}$, from the sum of the first three terms of (4.1.1),
and $e_{ijk}$ is the customary error term accounting for the random variation from
cell to cell. The terms $\alpha_i$ and $\beta_j$ are known as main effects. They are average
effects corresponding to each level of factor A and each level of factor B. The
term $(\alpha\beta)_{ij}$ is called an interaction effect. If the levels of factor A and factor B
behave in a strictly additive manner, that is, if a level of factor A contributes
a certain amount to the average yield, irrespective of the level of factor B, we
say that the $(\alpha\beta)_{ij}$'s are all zero. On the other hand, if a level of factor A, say,
3, increases yield more with, say, level 1 than with level 2 of factor B, we say
that $(\alpha\beta)_{31}$ is positive and $(\alpha\beta)_{32}$ is negative. The model (4.1.1) states that an
observation $y_{ijk}$ consists of these components:

(i) the overall mean $\mu$,
(ii) the main effect $\alpha_i$ for factor A at the i-th level,
(iii) the main effect $\beta_j$ for factor B at the j-th level,
(iv) the interaction effect $(\alpha\beta)_{ij}$ when factor A is at the i-th level and factor
B is at the j-th level, and
(v) an error or residual term $e_{ijk}$ which is the deviation of a particular
observation from the cell mean $\mu_{ij}$.
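
The decomposition of a cell mean into the first four components can be sketched numerically. The snippet below is only an illustration: the function name `decompose_cell_means` and the 2 × 2 table of cell means are hypothetical and not taken from the text, and the effects are defined through the usual sum-to-zero side conditions of a fixed effects model.

```python
import numpy as np

def decompose_cell_means(mu_ij):
    """Split an a x b table of cell means into mu, alpha_i, beta_j, (alpha beta)_ij.

    Assumes sum-to-zero side conditions on the main effects and interactions.
    """
    mu_ij = np.asarray(mu_ij, dtype=float)
    mu = mu_ij.mean()                       # overall mean
    alpha = mu_ij.mean(axis=1) - mu         # row (factor A) main effects
    beta = mu_ij.mean(axis=0) - mu          # column (factor B) main effects
    # interaction = departure of the cell mean from mu + alpha_i + beta_j
    inter = mu_ij - mu - alpha[:, None] - beta[None, :]
    return mu, alpha, beta, inter

# Hypothetical cell means (illustrative numbers only).
cell_means = np.array([[10.0, 12.0],
                       [14.0, 20.0]])
mu, alpha, beta, inter = decompose_cell_means(cell_means)
print(mu, alpha, beta)
print(inter)
```

A strictly additive table would give an interaction array of zeros; here the nonzero entries show that the effect of A differs across the levels of B.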


Remark: Two-way crossed classifications are widely used in many areas of scientific 
research and applications. Some examples are as follows: 


(a) A city has a sources of a pollutant, n samples are taken from each source, and 
are sent to b laboratories for analysis of its chemical composition. 

(b) An organism has a species, n females are taken from each species, and each 
one is used in b experiments in order to measure variation in progeny. 

(c) An industrial production involves a machines, b workers, and n samples of the 
product are taken from each machine x worker combination. 


4.2 ASSUMPTIONS OF THE MODEL 


As in the case of model (3.1.1), the assumptions of the model (4.1.1) can be 
summarized as follows: 


(i) The errors $e_{ijk}$'s are randomly distributed with mean zero and common
variance $\sigma_e^2$.

(ii) The errors associated with any pair of observations are uncorrelated.

(iii) Under Model I, the $\alpha_i$'s, $\beta_j$'s, and $(\alpha\beta)_{ij}$'s are assumed to be fixed
constants subject to the constraints

\[
\sum_{i=1}^{a} \alpha_i = \sum_{j=1}^{b} \beta_j = \sum_{i=1}^{a} (\alpha\beta)_{ij} = \sum_{j=1}^{b} (\alpha\beta)_{ij} = 0.
\]

(iv) Under Model II, the $\alpha_i$'s, $\beta_j$'s, and $(\alpha\beta)_{ij}$'s are assumed to be randomly
distributed with zero means and variances $\sigma_\alpha^2$, $\sigma_\beta^2$, and $\sigma_{\alpha\beta}^2$,
respectively. Furthermore, the $\alpha_i$'s, $\beta_j$'s, and $(\alpha\beta)_{ij}$'s are mutually and
completely uncorrelated.¹ In this case, $\sigma_\alpha^2$, $\sigma_\beta^2$, $\sigma_{\alpha\beta}^2$, and $\sigma_e^2$ are the variance
components of the model (4.1.1) and the inferences are sought about
them.

(v) Although the distribution properties for Models I and II were fairly
straightforward to enumerate, this is not the case for Model III.
Generally, if any element in an interaction term is considered random, it
may be appropriate to assume that the interaction term has a Model II
effect. However, in that case, we must assume the corresponding
distribution properties of the interaction terms. The proper error term used
to test certain hypotheses and to construct certain confidence intervals
will depend on the distribution properties being assumed.


There are several types of distribution properties that have been proposed as 
realistic for various experimental situations and their full discussion is beyond 
the scope of this volume. The interested reader is referred to the works of Wilk 
(1955), Wilk and Kempthorne (1955, 1956), Scheffé (1956a,b), and Harville
(1978). The articles by Hocking (1973) and Sahai (1988) provide an excel- 
lent summary of various mixed models. We assume the following distribution 
properties and an analysis of variance for this situation is presented in succeed- 
ing sections. If an experimenter desires distribution properties other than those 
given here, tests and confidence intervals must be modified accordingly. One 
such alternate mixed model is discussed briefly in Section 4.19. 

We assume that the $\alpha_i$'s are constants subject to the constraint $\sum_{i=1}^{a} \alpha_i = 0$; the $\beta_j$'s
are uncorrelated random variables with mean zero and variance $\sigma_\beta^2$; the $(\alpha\beta)_{ij}$'s are
random variables with mean zero and variance $[(a - 1)/a]\sigma_{\alpha\beta}^2$ and subject to the
constraints $\sum_{i=1}^{a} (\alpha\beta)_{ij} = 0$ for all j.² This introduces dependence between


¹ Some statisticians have expressed concern over the fact that the interaction terms $(\alpha\beta)_{ij}$'s are
assumed to be uncorrelated to those of $\alpha_i$'s and $\beta_j$'s. However, the assumption is consistent
with the results from the finite population models that define the interaction to be a function of
the main effects (see, e.g., Cornfield and Tukey (1956); Scheffé (1959, Section 7.4)).

² Since the $\alpha_i$'s are assumed to be fixed subject to the constraint that $\sum_{i=1}^{a} \alpha_i = 0$, it is felt that it
is reasonable to assume that the summation of the interaction terms, over all the levels of factor
A within a given level of factor B, should be equal to zero.




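certain interaction terms at different levels of the fixed factor. In fact, one can
show that (Graybill (1961, pp. 396-397))³

\[
E[(\alpha\beta)_{ij} (\alpha\beta)_{i'j'}] =
\begin{cases}
-\dfrac{1}{a}\,\sigma_{\alpha\beta}^2, & i \neq i',\; j = j', \\[1ex]
0, & j \neq j'.
\end{cases}
\tag{4.2.1}
\]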
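
The variance $[(a - 1)/a]\sigma_{\alpha\beta}^2$ and the covariance $-(1/a)\sigma_{\alpha\beta}^2$ can be checked with a small matrix calculation. The sketch below rests on an assumption that is not made in the text: for a fixed j, the interaction terms are generated by centering a uncorrelated variables with common variance $\sigma_{\alpha\beta}^2$, which is one simple way to satisfy the constraint $\sum_i (\alpha\beta)_{ij} = 0$. The function name is hypothetical.

```python
import numpy as np

def centered_interaction_covariance(a, sigma2_ab):
    """Covariance matrix of (g_i - g_bar), i = 1..a, for uncorrelated g_i.

    Assumption (not from the text): Var(g_i) = sigma2_ab and the interaction
    terms within one level of B are the centered values g_i - g_bar.
    """
    centering = np.eye(a) - np.ones((a, a)) / a   # maps g to g - g_bar
    # Cov(C g) = C (sigma2 I) C^T
    return sigma2_ab * centering @ centering.T

cov = centered_interaction_covariance(a=4, sigma2_ab=2.0)
print(cov)
```

Under this construction the diagonal entries equal $[(a - 1)/a]\sigma_{\alpha\beta}^2$ and the off-diagonal entries equal $-(1/a)\sigma_{\alpha\beta}^2$, in agreement with (4.2.1), and each row sums to zero as the constraint requires.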


Note that in this model the variance of $(\alpha\beta)_{ij}$ is defined as $[(a - 1)/a]\sigma_{\alpha\beta}^2$
instead of $\sigma_{\alpha\beta}^2$ in order to simplify the expressions for expected mean squares.
Furthermore, the $\beta_j$'s, $(\alpha\beta)_{ij}$'s, and $e_{ijk}$'s are uncorrelated with each other, and the
$e_{ijk}$'s have mean zero and variance $\sigma_e^2$.

The objective in this model is to test the hypotheses $\sigma_\beta^2 = 0$, $\sigma_{\alpha\beta}^2 = 0$, and $\alpha_i = 0$ for all i;
and to find point and interval estimates for the variance components $\sigma_e^2$, $\sigma_{\alpha\beta}^2$, $\sigma_\beta^2$, the
fixed effects $\alpha_i$'s, and their contrasts.


Remarks: (i) Under Model I, the assumptions that $\sum_{i=1}^{a} (\alpha\beta)_{ij} = 0 = \sum_{j=1}^{b} (\alpha\beta)_{ij}$
can be made without any loss of generality because, by definition, $(\alpha\beta)_{ij}$ is a differential
effect and if the sum, say, over i, were equal to some non-zero constant c, we could
replace $\beta_j$ with $\beta_j - c/a$ and the resultant $(\alpha\beta)_{ij}$ will then sum to zero. Similarly, the
assumptions that $\sum_{i=1}^{a} \alpha_i = 0 = \sum_{j=1}^{b} \beta_j$ can be made without any loss of generality
(see Remark at the end of Section 2.2).

(ii) As indicated in the Remark of Section 3.2, under the assumptions of the random
effects model in (4.1.1), there are three intraclass correlations defined as follows: the
correlation between the observations within the same level of factor A; within the
same level of factor B; and between the observations within the same cell. Denoting
these correlations by $\rho_\alpha$, $\rho_\beta$, and $\rho_{\alpha\beta}$, we have
$\rho_\alpha = \sigma_\alpha^2/(\sigma_e^2 + \sigma_{\alpha\beta}^2 + \sigma_\beta^2 + \sigma_\alpha^2)$,
$\rho_\beta = \sigma_\beta^2/(\sigma_e^2 + \sigma_{\alpha\beta}^2 + \sigma_\beta^2 + \sigma_\alpha^2)$, and
$\rho_{\alpha\beta} = \sigma_{\alpha\beta}^2/(\sigma_e^2 + \sigma_{\alpha\beta}^2 + \sigma_\beta^2 + \sigma_\alpha^2)$. Under the assumptions
of the mixed model in (4.1.1), the correlation between the observations within the same
cell is given by $\rho_{\alpha\beta} = \sigma_{\alpha\beta}^2/(\sigma_e^2 + \sigma_{\alpha\beta}^2 + \sigma_\beta^2)$.
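
As a quick numerical illustration of Remark (ii), the ratios can be evaluated for given variance components. The function names and the numerical values below are hypothetical; the formulas are those stated in the remark.

```python
def intraclass_correlations_random(s2_e, s2_ab, s2_b, s2_a):
    """Intraclass correlations of Remark (ii) under the random effects model."""
    total = s2_e + s2_ab + s2_b + s2_a
    return {
        "rho_alpha": s2_a / total,
        "rho_beta": s2_b / total,
        "rho_alpha_beta": s2_ab / total,
    }

def intraclass_correlation_mixed(s2_e, s2_ab, s2_b):
    """Within-cell correlation of Remark (ii) under the mixed model."""
    return s2_ab / (s2_e + s2_ab + s2_b)

# Hypothetical variance components (illustrative only).
rho = intraclass_correlations_random(s2_e=4.0, s2_ab=1.0, s2_b=2.0, s2_a=3.0)
rho_mixed = intraclass_correlation_mixed(s2_e=4.0, s2_ab=1.0, s2_b=2.0)
print(rho, rho_mixed)
```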


4.3. PARTITION OF THE TOTAL SUM OF SQUARES 


To partition the total sum of squares, we start with the identity 


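\[
y_{ijk} - \bar{y}_{...} = (\bar{y}_{i..} - \bar{y}_{...}) + (\bar{y}_{.j.} - \bar{y}_{...}) + (\bar{y}_{ij.} - \bar{y}_{i..} - \bar{y}_{.j.} + \bar{y}_{...}) + (y_{ijk} - \bar{y}_{ij.}),
\]

and square and sum over i, j, and k to yield:

\begin{align}
\sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n} (y_{ijk} - \bar{y}_{...})^2
&= \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n} \big[ (\bar{y}_{i..} - \bar{y}_{...}) + (\bar{y}_{.j.} - \bar{y}_{...}) \notag \\
&\qquad + (\bar{y}_{ij.} - \bar{y}_{i..} - \bar{y}_{.j.} + \bar{y}_{...}) + (y_{ijk} - \bar{y}_{ij.}) \big]^2 \tag{4.3.1} \\
&= \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n} (\bar{y}_{i..} - \bar{y}_{...})^2 + \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n} (\bar{y}_{.j.} - \bar{y}_{...})^2 \notag \\
&\qquad + \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n} (\bar{y}_{ij.} - \bar{y}_{i..} - \bar{y}_{.j.} + \bar{y}_{...})^2 + \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n} (y_{ijk} - \bar{y}_{ij.})^2 \notag \\
&= bn \sum_{i=1}^{a} (\bar{y}_{i..} - \bar{y}_{...})^2 + an \sum_{j=1}^{b} (\bar{y}_{.j.} - \bar{y}_{...})^2 \notag \\
&\qquad + n \sum_{i=1}^{a} \sum_{j=1}^{b} (\bar{y}_{ij.} - \bar{y}_{i..} - \bar{y}_{.j.} + \bar{y}_{...})^2 + \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n} (y_{ijk} - \bar{y}_{ij.})^2, \tag{4.3.2}
\end{align}

where

\[
\bar{y}_{ij.} = \frac{\sum_{k=1}^{n} y_{ijk}}{n}, \qquad
\bar{y}_{i..} = \frac{\sum_{j=1}^{b} \sum_{k=1}^{n} y_{ijk}}{bn},
\]

\[
\bar{y}_{.j.} = \frac{\sum_{i=1}^{a} \sum_{k=1}^{n} y_{ijk}}{an}, \qquad \text{and} \qquad
\bar{y}_{...} = \frac{\sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n} y_{ijk}}{abn}.
\]

³ This implies that the interaction terms are correlated with a covariance equal to $-(1/a)\sigma_{\alpha\beta}^2$ and
that the covariance decreases as the number of levels of the factor A increases.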


The identity (4.3.1) is valid since all cross-product terms are equal to zero.

The terms to the right of (4.3.1) are denoted by $SS_A$, $SS_B$, $SS_{AB}$, and $SS_E$
in respective order and measure the variation due to the $\alpha_i$'s, $\beta_j$'s, $(\alpha\beta)_{ij}$'s,
and $e_{ijk}$'s, respectively. The identity states that we have partitioned the total
variation $SS_T$ into the following components:

(i) $SS_A$, called the A-factor sum of squares, representing the variation in
the $y_{ijk}$ due to the A-factor effects;

(ii) $SS_B$, called the B-factor sum of squares, representing variation in the
$y_{ijk}$ due to the B-factor effects;

(iii) $SS_{AB}$, called the interaction sum of squares, representing variation in
the $y_{ijk}$ due to the interaction effects; and

(iv) $SS_E$, called the error sum of squares, representing variation in the $y_{ijk}$
after removing A-factor effects, B-factor effects, and interaction effects.
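
The partition can be verified numerically for any balanced data set. The following is a minimal sketch, assuming the observations are stored in a NumPy array of shape (a, b, n); the function name `two_way_sums_of_squares` and the small data array are hypothetical and serve only as an illustration.

```python
import numpy as np

def two_way_sums_of_squares(y):
    """Return SS_A, SS_B, SS_AB, SS_E, SS_T for balanced data y[i, j, k]."""
    y = np.asarray(y, dtype=float)
    a, b, n = y.shape
    grand = y.mean()                   # grand mean
    mean_i = y.mean(axis=(1, 2))       # means over levels of A
    mean_j = y.mean(axis=(0, 2))       # means over levels of B
    mean_ij = y.mean(axis=2)           # cell means
    ss_a = b * n * np.sum((mean_i - grand) ** 2)
    ss_b = a * n * np.sum((mean_j - grand) ** 2)
    ss_ab = n * np.sum((mean_ij - mean_i[:, None] - mean_j[None, :] + grand) ** 2)
    ss_e = np.sum((y - mean_ij[:, :, None]) ** 2)
    ss_t = np.sum((y - grand) ** 2)
    return {"SS_A": ss_a, "SS_B": ss_b, "SS_AB": ss_ab, "SS_E": ss_e, "SS_T": ss_t}

# Hypothetical data with a = 2, b = 2, n = 2 (illustrative numbers only).
y = np.array([[[1.0, 3.0], [2.0, 4.0]],
              [[5.0, 7.0], [9.0, 13.0]]])
ss = two_way_sums_of_squares(y)
print(ss)
```

The four component sums of squares add up to the total sum of squares, which is the content of the identity above.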


Remark: In a two-way crossed classification, one can compute ab separate variances
corresponding to each cell as $\sum_{k=1}^{n} (y_{ijk} - \bar{y}_{ij.})^2$, which can then be tested for homogeneity
of variances (see Section 2.22).




4.4 MEAN SQUARES AND THEIR EXPECTATIONS 


Mean squares are obtained in the usual way by dividing the sums of squares by
their corresponding degrees of freedom. For the $SS_A$ and $SS_B$ terms, we have the
conditions that

\[
\sum_{i=1}^{a} (\bar{y}_{i..} - \bar{y}_{...}) = \sum_{j=1}^{b} (\bar{y}_{.j.} - \bar{y}_{...}) = 0 \tag{4.4.1}
\]

and so the degrees of freedom are a - 1 and b - 1, respectively. For the $SS_{AB}$
term, note that the random quantities $\delta_{ij}$'s defined by

\[
\delta_{ij} = \bar{y}_{ij.} - \bar{y}_{i..} - \bar{y}_{.j.} + \bar{y}_{...}
\]

are subject to the conditions:

\[
\sum_{i=1}^{a} \delta_{ij} = 0, \quad \text{for each } j \ (b \text{ relations}), \tag{4.4.2}
\]

\[
\sum_{j=1}^{b} \delta_{ij} = 0, \quad \text{for each } i \ (a \text{ relations}). \tag{4.4.3}
\]

However, in effect, there are only a + b - 1 independent restrictions on the
$\delta_{ij}$'s. This is so since the b restrictions (4.4.2), when summed, determine the
relation

\[
\sum_{j=1}^{b} \left( \sum_{i=1}^{a} \delta_{ij} \right) = 0.
\]

Similarly, restrictions (4.4.3), when summed, must also be zero; that is,

\[
\sum_{i=1}^{a} \left( \sum_{j=1}^{b} \delta_{ij} \right) = 0.
\]

Thus, only a - 1 of the a relations (4.4.3) will be independent and the total
number of independent restrictions on the $\delta_{ij}$'s is a + b - 1. Since the number
of $\delta_{ij}$'s is ab, the number of degrees of freedom associated with the $SS_{AB}$
term is

\[
ab - (a + b - 1) = (a - 1)(b - 1).
\]

By subtraction from the total of abn - 1 degrees of freedom, the number of
degrees of freedom associated with the error term $SS_E$ is

\[
(abn - 1) - (a - 1) - (b - 1) - (a - 1)(b - 1) = ab(n - 1).
\]
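
The count of a + b - 1 independent restrictions can be confirmed by computing the rank of the linear constraints (4.4.2) and (4.4.3). This is a small sketch with a hypothetical helper name; it stacks one row per restriction, treating the ab quantities $\delta_{ij}$ as a flattened vector.

```python
import numpy as np

def interaction_constraint_rank(a, b):
    """Rank of the a + b sum-to-zero restrictions on the a*b deltas."""
    rows = []
    for j in range(b):            # (4.4.2): sum over i of delta_ij = 0
        m = np.zeros((a, b))
        m[:, j] = 1.0
        rows.append(m.ravel())
    for i in range(a):            # (4.4.3): sum over j of delta_ij = 0
        m = np.zeros((a, b))
        m[i, :] = 1.0
        rows.append(m.ravel())
    return int(np.linalg.matrix_rank(np.array(rows)))

a, b = 3, 4
rank = interaction_constraint_rank(a, b)
df_interaction = a * b - rank
print(rank, df_interaction)
```

For a = 3 and b = 4 the rank is a + b - 1, so the interaction degrees of freedom reduce to (a - 1)(b - 1).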




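The corresponding mean squares $MS_A$, $MS_B$, $MS_{AB}$, and $MS_E$ are, therefore,
defined by

\[
MS_A = \frac{SS_A}{a - 1}, \qquad MS_B = \frac{SS_B}{b - 1},
\]

\[
MS_{AB} = \frac{SS_{AB}}{(a - 1)(b - 1)}, \qquad \text{and} \qquad MS_E = \frac{SS_E}{ab(n - 1)}.
\]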


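Next, we examine the expectations of mean squares, which can be readily
derived from the assumptions of the model (4.1.1) and the usual laws of
expectations. On taking successive averages of the model equation (4.1.1), we
obtain

\begin{align}
\bar{y}_{ij.} &= \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \bar{e}_{ij.}, \tag{4.4.4} \\
\bar{y}_{i..} &= \mu + \alpha_i + \bar{\beta}_{.} + \overline{(\alpha\beta)}_{i.} + \bar{e}_{i..}, \tag{4.4.5} \\
\bar{y}_{.j.} &= \mu + \bar{\alpha}_{.} + \beta_j + \overline{(\alpha\beta)}_{.j} + \bar{e}_{.j.}, \tag{4.4.6}
\end{align}

and

\[
\bar{y}_{...} = \mu + \bar{\alpha}_{.} + \bar{\beta}_{.} + \overline{(\alpha\beta)}_{..} + \bar{e}_{...}, \tag{4.4.7}
\]

where, as usual, the bars indicate means over the subscripts shown by dots.
Substituting the values of $y_{ijk}$, $\bar{y}_{ij.}$, $\bar{y}_{i..}$, $\bar{y}_{.j.}$, and $\bar{y}_{...}$ given by (4.1.1), (4.4.4),
(4.4.5), (4.4.6), and (4.4.7), respectively, into the expressions for the $SS_A$, $SS_B$,
$SS_{AB}$, and $SS_E$ terms defined in (4.3.1), we obtain the following expressions: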


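\[
SS_E = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n} (e_{ijk} - \bar{e}_{ij.})^2, \tag{4.4.8}
\]

\[
SS_{AB} = n \sum_{i=1}^{a} \sum_{j=1}^{b} \big[ (\alpha\beta)_{ij} - \overline{(\alpha\beta)}_{i.} - \overline{(\alpha\beta)}_{.j} + \overline{(\alpha\beta)}_{..}
+ \bar{e}_{ij.} - \bar{e}_{i..} - \bar{e}_{.j.} + \bar{e}_{...} \big]^2, \tag{4.4.9}
\]

\[
SS_B = an \sum_{j=1}^{b} \big[ \beta_j - \bar{\beta}_{.} + \overline{(\alpha\beta)}_{.j} - \overline{(\alpha\beta)}_{..} + \bar{e}_{.j.} - \bar{e}_{...} \big]^2, \tag{4.4.10}
\]

and

\[
SS_A = bn \sum_{i=1}^{a} \big[ \alpha_i - \bar{\alpha}_{.} + \overline{(\alpha\beta)}_{i.} - \overline{(\alpha\beta)}_{..} + \bar{e}_{i..} - \bar{e}_{...} \big]^2. \tag{4.4.11}
\]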




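Now, because the $e_{ijk}$'s are uncorrelated and identically distributed with mean
zero and variance $\sigma_e^2$, it follows that

\begin{align}
E(e_{ijk}^2) &= \sigma_e^2, \tag{4.4.12} \\
E(\bar{e}_{ij.}^2) &= \sigma_e^2 / n, \tag{4.4.13} \\
E(\bar{e}_{i..}^2) &= \sigma_e^2 / bn, \tag{4.4.14} \\
E(\bar{e}_{.j.}^2) &= \sigma_e^2 / an, \tag{4.4.15}
\end{align}

and

\[
E(\bar{e}_{...}^2) = \sigma_e^2 / abn. \tag{4.4.16}
\]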


It is then a matter of straightforward computation to derive the expectations of
mean squares. First, taking the expectation of (4.4.8), we obtain

\[
E(SS_E) = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n} E(e_{ijk} - \bar{e}_{ij.})^2 = ab(n - 1)\sigma_e^2. \tag{4.4.17}
\]

The expectation of $MS_E$ is, therefore, given by

\[
E(MS_E) = E\left( \frac{SS_E}{ab(n - 1)} \right) = \sigma_e^2. \tag{4.4.18}
\]

Note that the result (4.4.18) is true under the assumptions of the fixed and
random, as well as mixed effects models.

Now, to derive the expectations of $MS_{AB}$, $MS_B$, and $MS_A$, we consider the
cases of Models I, II, and III separately.


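MODEL I (FIXED EFFECTS)


Under Model I, the $\alpha_i$'s, $\beta_j$'s, and $(\alpha\beta)_{ij}$'s are fixed quantities with the
restrictions that $\bar{\alpha}_{.} = \bar{\beta}_{.} = \overline{(\alpha\beta)}_{i.} = \overline{(\alpha\beta)}_{.j} = \overline{(\alpha\beta)}_{..} = 0$. Therefore, the expressions
(4.4.9) through (4.4.11) reduce to

\[
SS_{AB} = n \sum_{i=1}^{a} \sum_{j=1}^{b} \big[ (\alpha\beta)_{ij} + \bar{e}_{ij.} - \bar{e}_{i..} - \bar{e}_{.j.} + \bar{e}_{...} \big]^2, \tag{4.4.19}
\]

\[
SS_B = an \sum_{j=1}^{b} \big[ \beta_j + \bar{e}_{.j.} - \bar{e}_{...} \big]^2, \tag{4.4.20}
\]

and

\[
SS_A = bn \sum_{i=1}^{a} \big[ \alpha_i + \bar{e}_{i..} - \bar{e}_{...} \big]^2. \tag{4.4.21}
\]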


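On taking the expectation of (4.4.19), we obtain

\begin{align}
E(SS_{AB}) &= n \sum_{i=1}^{a} \sum_{j=1}^{b} E\big[ (\alpha\beta)_{ij} + \bar{e}_{ij.} - \bar{e}_{i..} - \bar{e}_{.j.} + \bar{e}_{...} \big]^2 \notag \\
&= n \sum_{i=1}^{a} \sum_{j=1}^{b} (\alpha\beta)_{ij}^2 + n \sum_{i=1}^{a} \sum_{j=1}^{b} E(\bar{e}_{ij.} - \bar{e}_{i..} - \bar{e}_{.j.} + \bar{e}_{...})^2, \tag{4.4.22}
\end{align}

since the $(\alpha\beta)_{ij}$'s are constants and the expectation of the cross-product term is
zero. By proceeding as in the derivation of the result (3.4.11), it can be readily
shown that

\[
\sum_{i=1}^{a} \sum_{j=1}^{b} E(\bar{e}_{ij.} - \bar{e}_{i..} - \bar{e}_{.j.} + \bar{e}_{...})^2 = (a - 1)(b - 1)\frac{\sigma_e^2}{n}. \tag{4.4.23}
\]

Then, on substituting (4.4.23) into (4.4.22), we obtain

\[
E(SS_{AB}) = n \sum_{i=1}^{a} \sum_{j=1}^{b} (\alpha\beta)_{ij}^2 + (a - 1)(b - 1)\sigma_e^2. \tag{4.4.24}
\]

Therefore, the expectation of $MS_{AB}$ is given by

\[
E(MS_{AB}) = E\left( \frac{SS_{AB}}{(a - 1)(b - 1)} \right) = \sigma_e^2 + \frac{n}{(a - 1)(b - 1)} \sum_{i=1}^{a} \sum_{j=1}^{b} (\alpha\beta)_{ij}^2. \tag{4.4.25}
\]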




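To find the expectation of $MS_B$, we note from (4.4.20) that

\[
E(SS_B) = an \left[ \sum_{j=1}^{b} \beta_j^2 + E \sum_{j=1}^{b} (\bar{e}_{.j.} - \bar{e}_{...})^2 \right], \tag{4.4.26}
\]

since the $\beta_j$'s are constants and the expectation of the cross-product term is zero.
Now, using the results (4.4.15) and (4.4.16), we find that

\begin{align}
E \sum_{j=1}^{b} (\bar{e}_{.j.} - \bar{e}_{...})^2 &= E\left[ \sum_{j=1}^{b} \bar{e}_{.j.}^2 - b\,\bar{e}_{...}^2 \right] \notag \\
&= b\,\frac{\sigma_e^2}{an} - b\,\frac{\sigma_e^2}{abn} \notag \\
&= \frac{b - 1}{an}\,\sigma_e^2. \tag{4.4.27}
\end{align}

Then, on substituting (4.4.27) into (4.4.26), we obtain

\[
E(SS_B) = an \sum_{j=1}^{b} \beta_j^2 + (b - 1)\sigma_e^2. \tag{4.4.28}
\]

Therefore, the expectation of $MS_B$ is given by

\[
E(MS_B) = E\left( \frac{SS_B}{b - 1} \right) = \sigma_e^2 + \frac{an}{b - 1} \sum_{j=1}^{b} \beta_j^2. \tag{4.4.29}
\]

Finally, from symmetry, it follows that the expectation of $MS_A$ is given by

\[
E(MS_A) = \sigma_e^2 + \frac{bn}{a - 1} \sum_{i=1}^{a} \alpha_i^2. \tag{4.4.30}
\]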
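
The fixed effects results (4.4.18), (4.4.25), (4.4.29), and (4.4.30) can be evaluated for given effects. The sketch below uses a hypothetical function name and hypothetical effect values (chosen to satisfy the sum-to-zero restrictions); it simply plugs numbers into the formulas above.

```python
import numpy as np

def expected_mean_squares_fixed(alpha, beta, inter, n, sigma2_e):
    """Expected mean squares under Model I for effects obeying the side conditions."""
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    inter = np.asarray(inter, dtype=float)
    a, b = alpha.size, beta.size
    return {
        "MS_A": sigma2_e + b * n * np.sum(alpha ** 2) / (a - 1),                  # (4.4.30)
        "MS_B": sigma2_e + a * n * np.sum(beta ** 2) / (b - 1),                   # (4.4.29)
        "MS_AB": sigma2_e + n * np.sum(inter ** 2) / ((a - 1) * (b - 1)),         # (4.4.25)
        "MS_E": sigma2_e,                                                         # (4.4.18)
    }

# Hypothetical effects for a = b = 2 with n = 2 replications per cell.
ems_fixed = expected_mean_squares_fixed(
    alpha=[-3.0, 3.0], beta=[-2.0, 2.0],
    inter=[[1.0, -1.0], [-1.0, 1.0]], n=2, sigma2_e=4.0,
)
print(ems_fixed)
```

Each expected mean square exceeds $\sigma_e^2$ only through the corresponding sum of squared effects, which is why each of $MS_A$, $MS_B$, and $MS_{AB}$ is compared with $MS_E$ under Model I.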


MODEL II (RANDOM EFFECTS) 


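Under Model II, the $\alpha_i$'s, $\beta_j$'s, and $(\alpha\beta)_{ij}$'s are mutually and completely
uncorrelated random variables with mean zero and variances $\sigma_\alpha^2$, $\sigma_\beta^2$, and $\sigma_{\alpha\beta}^2$,
respectively. It then follows, using the formulae for the variances of the sampling
distribution of the means of $\alpha_i$'s, $\beta_j$'s, and $(\alpha\beta)_{ij}$'s, that

\begin{align}
E(\alpha_i^2) &= \sigma_\alpha^2, \tag{4.4.31} \\
E(\bar{\alpha}_{.}^2) &= \sigma_\alpha^2 / a, \tag{4.4.32} \\
E(\beta_j^2) &= \sigma_\beta^2, \tag{4.4.33} \\
E(\bar{\beta}_{.}^2) &= \sigma_\beta^2 / b, \tag{4.4.34} \\
E[(\alpha\beta)_{ij}^2] &= \sigma_{\alpha\beta}^2, \tag{4.4.35} \\
E[\overline{(\alpha\beta)}_{i.}^2] &= \sigma_{\alpha\beta}^2 / b, \tag{4.4.36} \\
E[\overline{(\alpha\beta)}_{.j}^2] &= \sigma_{\alpha\beta}^2 / a, \tag{4.4.37}
\end{align}

and

\[
E[\overline{(\alpha\beta)}_{..}^2] = \sigma_{\alpha\beta}^2 / ab. \tag{4.4.38}
\]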

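First, taking the expectation of (4.4.9), we obtain

\begin{align}
E(SS_{AB}) &= n \sum_{i=1}^{a} \sum_{j=1}^{b} E\big[ (\alpha\beta)_{ij} - \overline{(\alpha\beta)}_{i.} - \overline{(\alpha\beta)}_{.j} + \overline{(\alpha\beta)}_{..} \big]^2 \notag \\
&\qquad + n \sum_{i=1}^{a} \sum_{j=1}^{b} E\big[ \bar{e}_{ij.} - \bar{e}_{i..} - \bar{e}_{.j.} + \bar{e}_{...} \big]^2, \tag{4.4.39}
\end{align}

since the expectation of the cross-product term is zero. By proceeding as in the
derivation of the result (3.4.11), it can be shown that

\[
\sum_{i=1}^{a} \sum_{j=1}^{b} E\big[ (\alpha\beta)_{ij} - \overline{(\alpha\beta)}_{i.} - \overline{(\alpha\beta)}_{.j} + \overline{(\alpha\beta)}_{..} \big]^2 = (a - 1)(b - 1)\sigma_{\alpha\beta}^2 \tag{4.4.40}
\]

and

\[
\sum_{i=1}^{a} \sum_{j=1}^{b} E\big[ \bar{e}_{ij.} - \bar{e}_{i..} - \bar{e}_{.j.} + \bar{e}_{...} \big]^2 = (a - 1)(b - 1)\frac{\sigma_e^2}{n}. \tag{4.4.41}
\]

On substituting (4.4.40) and (4.4.41) into (4.4.39), we obtain

\[
E(SS_{AB}) = (a - 1)(b - 1)\big[ \sigma_e^2 + n\sigma_{\alpha\beta}^2 \big]. \tag{4.4.42}
\]

Therefore, the expectation of $MS_{AB}$ is given by

\[
E(MS_{AB}) = E\left( \frac{SS_{AB}}{(a - 1)(b - 1)} \right) = \sigma_e^2 + n\sigma_{\alpha\beta}^2. \tag{4.4.43}
\]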




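Next, taking the expectation of (4.4.10), we obtain

\[
E(SS_B) = an \left[ E \sum_{j=1}^{b} (\beta_j - \bar{\beta}_{.})^2 + E \sum_{j=1}^{b} \big( \overline{(\alpha\beta)}_{.j} - \overline{(\alpha\beta)}_{..} \big)^2
+ E \sum_{j=1}^{b} (\bar{e}_{.j.} - \bar{e}_{...})^2 \right], \tag{4.4.44}
\]

since again the expectations of the cross-product terms are zero. Using the
results (4.4.33) and (4.4.34), it follows that

\[
E \sum_{j=1}^{b} (\beta_j - \bar{\beta}_{.})^2 = (b - 1)\sigma_\beta^2. \tag{4.4.45}
\]

Similarly, using (4.4.37) and (4.4.38), we have

\[
E \sum_{j=1}^{b} \big( \overline{(\alpha\beta)}_{.j} - \overline{(\alpha\beta)}_{..} \big)^2 = (b - 1)\frac{\sigma_{\alpha\beta}^2}{a}; \tag{4.4.46}
\]

and, finally, using (4.4.15) and (4.4.16), we have

\[
E \sum_{j=1}^{b} (\bar{e}_{.j.} - \bar{e}_{...})^2 = (b - 1)\frac{\sigma_e^2}{an}. \tag{4.4.47}
\]

Substituting (4.4.45), (4.4.46), and (4.4.47) into (4.4.44), we obtain

\[
E(SS_B) = (b - 1)\big[ \sigma_e^2 + n\sigma_{\alpha\beta}^2 + an\sigma_\beta^2 \big]. \tag{4.4.48}
\]

The expectation of $MS_B$ is, therefore, given by

\[
E(MS_B) = E\left( \frac{SS_B}{b - 1} \right) = \sigma_e^2 + n\sigma_{\alpha\beta}^2 + an\sigma_\beta^2. \tag{4.4.49}
\]

Finally, from symmetry, it follows that

\[
E(MS_A) = \sigma_e^2 + n\sigma_{\alpha\beta}^2 + bn\sigma_\alpha^2. \tag{4.4.50}
\]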
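
The random effects results (4.4.18), (4.4.43), (4.4.49), and (4.4.50) can likewise be tabulated for given variance components. This is a minimal sketch; the function name and the numerical values are hypothetical.

```python
def expected_mean_squares_random(a, b, n, s2_e, s2_ab, s2_b, s2_a):
    """Expected mean squares under Model II for a balanced a x b layout."""
    return {
        "MS_A": s2_e + n * s2_ab + b * n * s2_a,   # (4.4.50)
        "MS_B": s2_e + n * s2_ab + a * n * s2_b,   # (4.4.49)
        "MS_AB": s2_e + n * s2_ab,                 # (4.4.43)
        "MS_E": s2_e,                              # (4.4.18)
    }

# Hypothetical design and variance components (illustrative only).
ems_random = expected_mean_squares_random(
    a=3, b=4, n=2, s2_e=1.0, s2_ab=0.5, s2_b=2.0, s2_a=1.0
)
print(ems_random)
```

In contrast with Model I, the expectations of $MS_A$ and $MS_B$ here both contain the term $n\sigma_{\alpha\beta}^2$, so they differ from $E(MS_{AB})$ only through $bn\sigma_\alpha^2$ and $an\sigma_\beta^2$, respectively.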


MODEL III (MIXED EFFECTS) 


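Under Model III, the $\alpha_i$'s are constants with the restriction that $\sum_{i=1}^{a} \alpha_i = 0$; the $\beta_j$'s
are uncorrelated random variables with mean zero and variance $\sigma_\beta^2$, and the $(\alpha\beta)_{ij}$'s
are random variables with mean zero and variance-covariance structure given
by (4.2.1), and subject to the constraints $\sum_{i=1}^{a} (\alpha\beta)_{ij} = 0$ for all j. Furthermore,
the $\beta_j$'s, $(\alpha\beta)_{ij}$'s, and $e_{ijk}$'s are uncorrelated with each other. It then follows,
using the formulae for the variances of the sampling distributions of the means
of the $\beta_j$'s and $(\alpha\beta)_{ij}$'s, that

\begin{align}
E(\beta_j^2) &= \sigma_\beta^2, \tag{4.4.51} \\
E(\bar{\beta}_{.}^2) &= \sigma_\beta^2 / b, \tag{4.4.52} \\
E[(\alpha\beta)_{ij}^2] &= \frac{a - 1}{a}\,\sigma_{\alpha\beta}^2, \tag{4.4.53}
\end{align}

and

\[
E[\overline{(\alpha\beta)}_{i.}^2] = \frac{a - 1}{ab}\,\sigma_{\alpha\beta}^2. \tag{4.4.54}
\]

Now, under the restrictions that $\bar{\alpha}_{.} = \overline{(\alpha\beta)}_{.j} = \overline{(\alpha\beta)}_{..} = 0$, the expressions
(4.4.9) through (4.4.11) reduce to

\[
SS_{AB} = n \sum_{i=1}^{a} \sum_{j=1}^{b} \big[ (\alpha\beta)_{ij} - \overline{(\alpha\beta)}_{i.} + \bar{e}_{ij.} - \bar{e}_{i..} - \bar{e}_{.j.} + \bar{e}_{...} \big]^2, \tag{4.4.55}
\]

\[
SS_B = an \sum_{j=1}^{b} \big[ \beta_j - \bar{\beta}_{.} + \bar{e}_{.j.} - \bar{e}_{...} \big]^2, \tag{4.4.56}
\]

and

\[
SS_A = bn \sum_{i=1}^{a} \big[ \alpha_i + \overline{(\alpha\beta)}_{i.} + \bar{e}_{i..} - \bar{e}_{...} \big]^2. \tag{4.4.57}
\]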

i=] 


First, taking the expectation of (4.4.55), we obtain 


a b a b 
E(SSas)=n >> )> El(@B):; -@Bi.P +n >_> Eley —%..-2,. +22, 
i=] j=] i=1 j=1 


(4.4.58) 


since the expectation of the cross-product term is zero. Using the results (4.4.53) 
and (4.4.54), we obtain 


ab a b 
> > El@B); — @B.P =D bs E(aB);; — od 


i=1 j=l i=1 | j=! 


— ] — ] 
i=] a ab 


= (a — 1)(b — l)ogg. (4.4.59) 




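Furthermore, as shown in (3.4.11), we have
$$\sum_{i=1}^{a}\sum_{j=1}^{b} E(\bar{e}_{ij.} - \bar{e}_{i..} - \bar{e}_{.j.} + \bar{e}_{...})^2 = (a-1)(b-1)\frac{\sigma_e^2}{n}. \tag{4.4.60}$$
Substituting (4.4.59) and (4.4.60) into (4.4.58), we obtain
$$E(SS_{AB}) = (a-1)(b-1)[\sigma_e^2 + n\sigma_{\alpha\beta}^2]. \tag{4.4.61}$$
Therefore, the expectation of $MS_{AB}$ is given by
$$E(MS_{AB}) = E\left(\frac{SS_{AB}}{(a-1)(b-1)}\right) = \sigma_e^2 + n\sigma_{\alpha\beta}^2. \tag{4.4.62}$$
Next, taking the expectation of (4.4.56), we have
$$E(SS_B) = an\left[E\sum_{j=1}^{b}(\beta_j - \bar{\beta}_.)^2 + E\sum_{j=1}^{b}(\bar{e}_{.j.} - \bar{e}_{...})^2\right]. \tag{4.4.63}$$
As in (4.4.45) and (4.4.47), it is easy to verify that
$$E\sum_{j=1}^{b}(\beta_j - \bar{\beta}_.)^2 = (b-1)\sigma_\beta^2 \tag{4.4.64}$$
and
$$E\sum_{j=1}^{b}(\bar{e}_{.j.} - \bar{e}_{...})^2 = (b-1)\frac{\sigma_e^2}{an}. \tag{4.4.65}$$
Substituting (4.4.64) and (4.4.65) into (4.4.63), we obtain
$$E(SS_B) = (b-1)[\sigma_e^2 + an\sigma_\beta^2], \tag{4.4.66}$$
so that the expectation of $MS_B$ is given by
$$E(MS_B) = E\left(\frac{SS_B}{b-1}\right) = \sigma_e^2 + an\sigma_\beta^2. \tag{4.4.67}$$
Finally, taking the expectation of (4.4.57), we obtain
$$E(SS_A) = bn\left[\sum_{i=1}^{a}\alpha_i^2 + E\sum_{i=1}^{a}\overline{(\alpha\beta)}_{i.}^2 + E\sum_{i=1}^{a}(\bar{e}_{i..} - \bar{e}_{...})^2\right], \tag{4.4.68}$$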




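since the $\alpha_i$'s are constants and the expectations of the cross-product terms are
zero. From (4.4.14), (4.4.16), and (4.4.54), it readily follows that
$$E\sum_{i=1}^{a}\overline{(\alpha\beta)}_{i.}^2 = \frac{a-1}{b}\sigma_{\alpha\beta}^2 \tag{4.4.69}$$
and
$$E\sum_{i=1}^{a}(\bar{e}_{i..} - \bar{e}_{...})^2 = (a-1)\frac{\sigma_e^2}{bn}. \tag{4.4.70}$$
Substituting (4.4.69) and (4.4.70) into (4.4.68), we obtain
$$E(SS_A) = (a-1)\sigma_e^2 + (a-1)n\sigma_{\alpha\beta}^2 + bn\sum_{i=1}^{a}\alpha_i^2.$$
Hence, the expectation of $MS_A$ is given by
$$E(MS_A) = E\left(\frac{SS_A}{a-1}\right) = \sigma_e^2 + n\sigma_{\alpha\beta}^2 + \frac{bn}{a-1}\sum_{i=1}^{a}\alpha_i^2.$$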


The foregoing results of Sections 4.3 and 4.4 can now be summarized in a
tabular form as the analysis of variance table shown in Table 4.2.


TABLE 4.2
Analysis of Variance for Model (4.1.1)

Source of          Degrees of          Sum of       Mean
Variation          Freedom             Squares      Square
Due to A           $a - 1$             $SS_A$       $MS_A$
Due to B           $b - 1$             $SS_B$       $MS_B$
Interaction        $(a-1)(b-1)$        $SS_{AB}$    $MS_{AB}$
A x B
Error              $ab(n-1)$           $SS_E$       $MS_E$
Total              $abn - 1$           $SS_T$

Expected Mean Square

Source of
Variation          Model I                                                                          Model II                                                  Model III
Due to A           $\sigma_e^2 + \frac{bn}{a-1}\sum_{i=1}^{a}\alpha_i^2$                             $\sigma_e^2 + n\sigma_{\alpha\beta}^2 + bn\sigma_\alpha^2$   $\sigma_e^2 + n\sigma_{\alpha\beta}^2 + \frac{bn}{a-1}\sum_{i=1}^{a}\alpha_i^2$
Due to B           $\sigma_e^2 + \frac{an}{b-1}\sum_{j=1}^{b}\beta_j^2$                              $\sigma_e^2 + n\sigma_{\alpha\beta}^2 + an\sigma_\beta^2$    $\sigma_e^2 + an\sigma_\beta^2$
Interaction        $\sigma_e^2 + \frac{n}{(a-1)(b-1)}\sum_{i=1}^{a}\sum_{j=1}^{b}(\alpha\beta)_{ij}^2$   $\sigma_e^2 + n\sigma_{\alpha\beta}^2$                       $\sigma_e^2 + n\sigma_{\alpha\beta}^2$
A x B
Error              $\sigma_e^2$                                                                     $\sigma_e^2$                                              $\sigma_e^2$


4.5 SAMPLING DISTRIBUTION OF MEAN SQUARES


In this section, we give the distribution results on mean squares for the fixed,
random, and mixed effects models. The derivation of these results is beyond
the scope of this volume and can be found in Scheffé (1959, pp. 109-112),
Graybill (1961, pp. 397-402; 1976, pp. 630-632), and Searle et al. (1992, pp.
131-132). Note that although in the derivation of the expected mean squares
we have not made any distribution assumption about the form of the random
components of the model (4.1.1), we do require the assumption of normality to
derive their sampling distributions.


MODEL I (FIXED EFFECTS)
Under the distribution assumptions of Model I, it can be shown that:

(a) The quantities $MS_E$, $MS_{AB}$, $MS_B$, and $MS_A$ are statistically independent.
(b) The following results are true:
(i)
$$\frac{MS_E}{\sigma_e^2} \sim \frac{\chi^2[ab(n-1)]}{ab(n-1)}, \tag{4.5.1}$$
(ii)
$$\frac{MS_{AB}}{\sigma_e^2} \sim \frac{\chi^{2\prime}[(a-1)(b-1), \lambda_{AB}]}{(a-1)(b-1)}, \tag{4.5.2}$$
(iii)
$$\frac{MS_B}{\sigma_e^2} \sim \frac{\chi^{2\prime}[b-1, \lambda_B]}{b-1}, \tag{4.5.3}$$
and
(iv)
$$\frac{MS_A}{\sigma_e^2} \sim \frac{\chi^{2\prime}[a-1, \lambda_A]}{a-1}, \tag{4.5.4}$$
where, as usual, $\chi^2[\cdot]$ denotes a central and $\chi^{2\prime}[\cdot\,,\cdot]$ denotes a noncentral
chi-square variable with respective degrees of freedom and the
noncentrality parameters $\lambda_{AB}$, $\lambda_B$, and $\lambda_A$ defined by
$$\lambda_{AB} = \frac{n}{2\sigma_e^2}\sum_{i=1}^{a}\sum_{j=1}^{b}(\alpha\beta)_{ij}^2,$$
$$\lambda_B = \frac{an}{2\sigma_e^2}\sum_{j=1}^{b}\beta_j^2,$$
and
$$\lambda_A = \frac{bn}{2\sigma_e^2}\sum_{i=1}^{a}\alpha_i^2.$$

It, therefore, follows from (4.5.2) through (4.5.4) that
(ii)' If $(\alpha\beta)_{ij} = 0$, for all $i$ and $j$, then
$$\frac{MS_{AB}}{\sigma_e^2} \sim \frac{\chi^2[(a-1)(b-1)]}{(a-1)(b-1)}; \tag{4.5.5}$$
(iii)' If $\beta_j = 0$, for all $j$, then
$$\frac{MS_B}{\sigma_e^2} \sim \frac{\chi^2[b-1]}{b-1}; \tag{4.5.6}$$
(iv)' If $\alpha_i = 0$, for all $i$, then
$$\frac{MS_A}{\sigma_e^2} \sim \frac{\chi^2[a-1]}{a-1}. \tag{4.5.7}$$


MODEL II (RANDOM EFFECTS) 
Under the distribution assumptions of Model II, it can be shown that: 


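(a) The quantities $MS_E$, $MS_{AB}$, $MS_B$, and $MS_A$ are statistically independent.
(b) The following results are true:
(i)
$$\frac{MS_E}{\sigma_e^2} \sim \frac{\chi^2[ab(n-1)]}{ab(n-1)}, \tag{4.5.8}$$
(ii)
$$\frac{MS_{AB}}{\sigma_e^2 + n\sigma_{\alpha\beta}^2} \sim \frac{\chi^2[(a-1)(b-1)]}{(a-1)(b-1)}, \tag{4.5.9}$$
(iii)
$$\frac{MS_B}{\sigma_e^2 + n\sigma_{\alpha\beta}^2 + an\sigma_\beta^2} \sim \frac{\chi^2[b-1]}{b-1}, \tag{4.5.10}$$
and
(iv)
$$\frac{MS_A}{\sigma_e^2 + n\sigma_{\alpha\beta}^2 + bn\sigma_\alpha^2} \sim \frac{\chi^2[a-1]}{a-1}. \tag{4.5.11}$$

That is, the ratio of $MS_E$ to $\sigma_e^2$ is a $\chi^2[ab(n-1)]$ variable divided by $ab(n-1)$;
the ratio of $MS_{AB}$ to $\sigma_e^2 + n\sigma_{\alpha\beta}^2$ is a $\chi^2[(a-1)(b-1)]$ variable divided by
$(a-1)(b-1)$; the ratio of $MS_B$ to $\sigma_e^2 + n\sigma_{\alpha\beta}^2 + an\sigma_\beta^2$ is a $\chi^2[b-1]$ variable
divided by $b-1$; and the ratio of $MS_A$ to $\sigma_e^2 + n\sigma_{\alpha\beta}^2 + bn\sigma_\alpha^2$ is a $\chi^2[a-1]$
variable divided by $a-1$.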


MODEL III (MIXED EFFECTS) 


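Under the distribution assumptions of Model III, it can be shown that:

(a) The quantities $MS_E$, $MS_{AB}$, $MS_B$, and $MS_A$ are statistically independent.
(b) The following results are true:
(i)
$$\frac{MS_E}{\sigma_e^2} \sim \frac{\chi^2[ab(n-1)]}{ab(n-1)}, \tag{4.5.12}$$
(ii)
$$\frac{MS_{AB}}{\sigma_e^2 + n\sigma_{\alpha\beta}^2} \sim \frac{\chi^2[(a-1)(b-1)]}{(a-1)(b-1)}, \tag{4.5.13}$$
(iii)
$$\frac{MS_B}{\sigma_e^2 + an\sigma_\beta^2} \sim \frac{\chi^2[b-1]}{b-1}, \tag{4.5.14}$$
and
(iv)
$$\frac{MS_A}{\sigma_e^2 + n\sigma_{\alpha\beta}^2} \sim \frac{\chi^{2\prime}[a-1, \lambda_A]}{a-1}, \tag{4.5.15}$$
where
$$\lambda_A = \frac{bn}{2(\sigma_e^2 + n\sigma_{\alpha\beta}^2)}\sum_{i=1}^{a}\alpha_i^2.$$

It then follows from (4.5.15) that if $\alpha_i = 0$ for all $i$, then
(iv)'
$$\frac{MS_A}{\sigma_e^2 + n\sigma_{\alpha\beta}^2} \sim \frac{\chi^2[a-1]}{a-1}. \tag{4.5.16}$$


4.6 TESTS OF HYPOTHESES: THE ANALYSIS
OF VARIANCE F TESTS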


In this section, we present the usual hypotheses of interest and appropriate F
tests for fixed, random, and mixed effects models. As usual, the test statistic
is constructed by comparing two mean squares that have the same expectation
under $H_0$, with the numerator mean square having a larger expectation than the
denominator mean square under $H_1$.


MODEL I (FIXED EFFECTS)


The usual tests of hypotheses of interest are about AB interactions, factor B 
effects, and factor A effects. 


Test for AB Interactions 
Ordinarily the two-way classification study begins with a test to determine
whether the two factors interact. The hypothesis is
$$H_0^{AB}: \text{all } (\alpha\beta)_{ij}\text{'s} = 0 \quad \text{versus} \quad H_1^{AB}: \text{not all } (\alpha\beta)_{ij}\text{'s are zero}. \tag{4.6.1}$$
In order to develop a test procedure for the hypothesis (4.6.1), we note from
(4.4.18) and (4.4.25) that under $H_0^{AB}$,
$$E(MS_E) = \sigma_e^2, \qquad E(MS_{AB}) = \sigma_e^2;$$
and under $H_1^{AB}$,
$$E(MS_{AB}) > E(MS_E).$$
Furthermore, it follows from (4.5.1) and (4.5.5) that under $H_0^{AB}$,
$$F_{AB} = \frac{MS_{AB}/\sigma_e^2}{MS_E/\sigma_e^2} = \frac{MS_{AB}}{MS_E} \sim F[(a-1)(b-1), ab(n-1)]. \tag{4.6.2}$$
Thus, the statistic (4.6.2) provides a suitable test procedure for (4.6.1); $H_0^{AB}$
being rejected if
$$F_{AB} > F[(a-1)(b-1), ab(n-1); 1-\alpha].$$


Test for Factor B Effects 
The hypothesis is
$$H_0^{B}: \text{all } \beta_j = 0 \quad \text{versus} \quad H_1^{B}: \text{not all } \beta_j\text{'s are zero}. \tag{4.6.3}$$
In order to develop a test procedure for the hypothesis (4.6.3), we note from
(4.4.18) and (4.4.29) that under $H_0^{B}$,
$$E(MS_E) = \sigma_e^2, \qquad E(MS_B) = \sigma_e^2;$$
and under $H_1^{B}$,
$$E(MS_B) > E(MS_E).$$
Furthermore, it follows from (4.5.1) and (4.5.6) that under $H_0^{B}$,
$$F_B = \frac{MS_B/\sigma_e^2}{MS_E/\sigma_e^2} = \frac{MS_B}{MS_E} \sim F[b-1, ab(n-1)]. \tag{4.6.4}$$
Thus, the statistic (4.6.4) provides a suitable test procedure for (4.6.3); $H_0^{B}$
being rejected if
$$F_B > F[b-1, ab(n-1); 1-\alpha].$$


Test for Factor A Effects 
The hypothesis is
$$H_0^{A}: \text{all } \alpha_i = 0 \quad \text{versus} \quad H_1^{A}: \text{not all } \alpha_i\text{'s are zero}. \tag{4.6.5}$$
Proceeding as in the test for factor B effects, it readily follows that the statistic
$$F_A = \frac{MS_A/\sigma_e^2}{MS_E/\sigma_e^2} = \frac{MS_A}{MS_E} \sim F[a-1, ab(n-1)]$$
provides a suitable test procedure for (4.6.5); $H_0^{A}$ being rejected if
$$F_A > F[a-1, ab(n-1); 1-\alpha].$$


Remarks: (i) If a nonsignificant value of $F_{AB}$ occurs, some authors suggest that the
$MS_{AB}$ and $MS_E$ terms of the analysis of variance Table 4.2 be pooled to obtain a better
estimate of the error term, namely,
$$\frac{(a-1)(b-1)MS_{AB} + ab(n-1)MS_E}{(a-1)(b-1) + ab(n-1)} = \frac{SS_{AB} + SS_E}{abn - a - b + 1}.$$
The reason put forth is that when no interactions exist, $E(MS_{AB}) = \sigma_e^2$ gives the same
expectation as for $MS_E$, so that the new estimator of $\sigma_e^2$ would have a larger number of
degrees of freedom associated with it. However, this practice is not always recommended
since a nonsignificant F value does not mean that the hypothesis is true. In other words,
there is always the possibility of some interaction being present and not showing up
in the F test. Hence, the best estimate of $\sigma_e^2$ is always to be taken as $MS_E$, unless
the experimenter has additional information confirming the nonexistence of interaction
terms. Moreover, the pooling procedure affects both the level of significance and the
power of the tests for factor A and factor B effects, in ways that are not yet fully explored.
It is generally recommended that the pooling should not be undertaken unless:

(a) the degrees of freedom associated with $MS_E$ are too small; and
(b) the calculated value of the test statistic $MS_{AB}/MS_E$ falls well below the critical
value. Some authors recommend that $MS_{AB}$ should be nearly equal to $MS_E$.

Part (a) of this rule is intended to limit pooling to cases where the gains may indeed
be important, and part (b) is meant to ensure that in fact there are no interactions. For
some general rules of thumb for deciding when to pool see Paull (1950), Bozivich et al.
(1956), Srivastava and Bozivich (1962), and Mead et al. (1975).

(ii) It may be of interest to derive the significance level associated with the experiment
as a whole. Let $\alpha_1$, $\alpha_2$, and $\alpha_3$ be the significance levels of the F ratios $F_A$, $F_B$, and $F_{AB}$,
respectively, and let $\alpha$ be the overall significance level comprising all three tests. Then it can
be shown (Kimball (1951)) that $\alpha < 1 - (1-\alpha_1)(1-\alpha_2)(1-\alpha_3)$. For example, if $\alpha_1 =
\alpha_2 = \alpha_3 = 0.05$, then $\alpha < 1 - (1-0.05)^3 = 0.143$. Similarly, if $\alpha_1 = \alpha_2 = \alpha_3 = 0.01$,
then $\alpha < 0.030$.
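The bound in remark (ii) is easy to evaluate for any set of individual levels. The short sketch below reproduces the two numerical examples; the function name `overall_level_bound` is a hypothetical choice made for illustration.

```python
def overall_level_bound(levels):
    """Upper bound 1 - prod(1 - alpha_k) on the overall significance level."""
    product = 1.0
    for alpha_k in levels:
        product *= 1.0 - alpha_k
    return 1.0 - product

print(round(overall_level_bound([0.05, 0.05, 0.05]), 3))  # 0.143
print(round(overall_level_bound([0.01, 0.01, 0.01]), 3))  # 0.03
```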


MODEL II (RANDOM EFFECTS) 


In Model II, as in Model I, the usual hypotheses of interest are about AB 
interaction, factor B effects, and factor A effects. 


Test for AB Interactions 
The presence of interaction terms is tested by the hypothesis
$$H_0^{AB}: \sigma_{\alpha\beta}^2 = 0 \quad \text{versus} \quad H_1^{AB}: \sigma_{\alpha\beta}^2 > 0. \tag{4.6.6}$$
To obtain an appropriate test procedure for the hypothesis (4.6.6), we note from
(4.4.18) and (4.4.43) that under $H_0^{AB}$,
$$E(MS_E) = \sigma_e^2, \qquad E(MS_{AB}) = \sigma_e^2;$$
and under $H_1^{AB}$,
$$E(MS_{AB}) > E(MS_E).$$
Furthermore, it follows from (4.5.8) and (4.5.9) that under $H_0^{AB}$,
$$F_{AB} = \frac{MS_{AB}/\sigma_e^2}{MS_E/\sigma_e^2} = \frac{MS_{AB}}{MS_E} \sim F[(a-1)(b-1), ab(n-1)]. \tag{4.6.7}$$
Thus, the statistic (4.6.7) provides a suitable test procedure for (4.6.6); $H_0^{AB}$
being rejected if
$$F_{AB} > F[(a-1)(b-1), ab(n-1); 1-\alpha].$$


Test for Factor B Effects 
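The presence of factor B effects is tested by the hypothesis
$$H_0^{B}: \sigma_\beta^2 = 0 \quad \text{versus} \quad H_1^{B}: \sigma_\beta^2 > 0. \tag{4.6.8}$$
Again, we note from (4.4.43) and (4.4.49) that under $H_0^{B}$,
$$E(MS_{AB}) = \sigma_e^2 + n\sigma_{\alpha\beta}^2, \qquad E(MS_B) = \sigma_e^2 + n\sigma_{\alpha\beta}^2;$$
and under $H_1^{B}$,
$$E(MS_B) > E(MS_{AB}).$$
Furthermore, it follows from (4.5.9) and (4.5.10) that under $H_0^{B}$,
$$F_B = \frac{MS_B/(\sigma_e^2 + n\sigma_{\alpha\beta}^2)}{MS_{AB}/(\sigma_e^2 + n\sigma_{\alpha\beta}^2)} = \frac{MS_B}{MS_{AB}} \sim F[b-1, (a-1)(b-1)]. \tag{4.6.9}$$
Thus, the statistic (4.6.9) provides a suitable test procedure for (4.6.8); $H_0^{B}$
being rejected if
$$F_B > F[b-1, (a-1)(b-1); 1-\alpha].$$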


Test for Factor A Effects 
The presence of factor A effects is tested by the hypothesis
$$H_0^{A}: \sigma_\alpha^2 = 0 \quad \text{versus} \quad H_1^{A}: \sigma_\alpha^2 > 0. \tag{4.6.10}$$
Proceeding as in the test for factor B effects, it readily follows that the statistic
$$F_A = \frac{MS_A/(\sigma_e^2 + n\sigma_{\alpha\beta}^2)}{MS_{AB}/(\sigma_e^2 + n\sigma_{\alpha\beta}^2)} = \frac{MS_A}{MS_{AB}} \sim F[a-1, (a-1)(b-1)] \tag{4.6.11}$$
provides a suitable test procedure for (4.6.10); $H_0^{A}$ being rejected if
$$F_A > F[a-1, (a-1)(b-1); 1-\alpha].$$


Remarks: (i) The more general hypotheses of interest may be
$$\frac{\sigma_{\alpha\beta}^2}{\sigma_e^2} \le \rho_1, \qquad \frac{\sigma_\beta^2}{\sigma_e^2 + n\sigma_{\alpha\beta}^2} \le \rho_2, \qquad \frac{\sigma_\alpha^2}{\sigma_e^2 + n\sigma_{\alpha\beta}^2} \le \rho_3,$$
which are tested in the obvious way.

(ii) One of the most important differences between the tests of hypotheses in Models
I and II is that when a factor has a random effect, the main effect is tested by using
the interaction mean square in the denominator, whereas if it has a fixed effect, one
must divide by the error mean square.


MODEL III (MIXED EFFECTS) 


Under Model III, the hypotheses of interest are: $\sigma_{\alpha\beta}^2 = 0$, $\sigma_\beta^2 = 0$, and all $\alpha_i$'s $= 0$.
The appropriate tests are obtained in the same way as in the case of Models I
and II.


Test for AB Interactions

The hypothesis $H_0^{AB}: \sigma_{\alpha\beta}^2 = 0$ versus $H_1^{AB}: \sigma_{\alpha\beta}^2 > 0$ may be tested by the ratio
$MS_{AB}/MS_E$, which under $H_0^{AB}$ has an F distribution with $(a-1)(b-1)$ and
$ab(n-1)$ degrees of freedom. Similarly, the hypothesis $\sigma_{\alpha\beta}^2/\sigma_e^2 \le \rho_1$ can be
tested in the obvious way by using the statistic $(1 + n\sigma_{\alpha\beta}^2/\sigma_e^2)^{-1}(MS_{AB}/MS_E)$.


Test for Factor B Effects

The hypothesis $H_0^{B}: \sigma_\beta^2 = 0$ versus $H_1^{B}: \sigma_\beta^2 > 0$ may be tested by the ratio
$MS_B/MS_E$, which under $H_0^{B}$ has an F distribution with $b-1$ and $ab(n-1)$
degrees of freedom. Similarly, the hypothesis $\sigma_\beta^2/\sigma_e^2 \le \rho_2$ may be tested in the
obvious way by the test statistic $(1 + an\sigma_\beta^2/\sigma_e^2)^{-1}(MS_B/MS_E)$.


Test for Factor A Effects

The hypothesis $H_0^{A}$: all $\alpha_i = 0$ versus $H_1^{A}$: not all $\alpha_i = 0$ may be tested by the
ratio $MS_A/MS_{AB}$, which under $H_0^{A}$ has an F distribution with $a-1$ and
$(a-1)(b-1)$ degrees of freedom.




Remarks: (i) The tests for the main effects under Model III work conversely to the
tests under Models I and II. Here, the test statistic for factor B with random effects is
obtained by dividing $MS_B$ by $MS_E$, whereas if this factor had occurred under Model II,
then the statistic would be obtained by dividing $MS_B$ by $MS_{AB}$. Similarly, the factor A
with fixed effects is tested by dividing $MS_A$ by $MS_{AB}$, whereas if the factor had occurred
under Model I, then the test would have been made by dividing $MS_A$ by $MS_E$. These
results on tests of hypotheses were first developed by Johnson (1948).

(ii) If interaction terms are nonsignificant, one may want to test the hypothesis
$H_0^{A}$: all $\alpha_i$'s $= 0$ by the statistic $MS_A/MS_E$, which may provide more degrees of freedom
for the denominator. The possibility of pooling $MS_{AB}$ and $MS_E$ may also be considered
if degrees of freedom are few. The earlier comments on pooling also apply here.


SUMMARY OF MODELS AND TESTS 


The appropriate test statistics for Models I, II, and III developed in this section 
are summarized in Table 4.3. 


TABLE 4.3
Test Statistics for Models I, II, and III

Hypothesized       Model I              Model II              Model III
Effect             (A and B Fixed)      (A and B Random)      (A Fixed, B Random)
Factor A           $MS_A/MS_E$          $MS_A/MS_{AB}$        $MS_A/MS_{AB}$
Factor B           $MS_B/MS_E$          $MS_B/MS_{AB}$        $MS_B/MS_E$
Interaction AB     $MS_{AB}/MS_E$       $MS_{AB}/MS_E$        $MS_{AB}/MS_E$
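The choice of denominator in Table 4.3 can be encoded as a small lookup. The sketch below is illustrative only: the function name `f_tests`, the dictionary layout, and the mean square values are hypothetical, and critical values would still have to be taken from F tables or statistical software.

```python
# Denominator mean square for each effect, following Table 4.3.
DENOMINATORS = {
    "I":   {"A": "E",  "B": "E",  "AB": "E"},
    "II":  {"A": "AB", "B": "AB", "AB": "E"},
    "III": {"A": "AB", "B": "E",  "AB": "E"},   # A fixed, B random
}

def f_tests(ms, a, b, n, model):
    """Return {effect: (F ratio, numerator df, denominator df)} for a balanced layout."""
    df = {"A": a - 1, "B": b - 1, "AB": (a - 1) * (b - 1), "E": a * b * (n - 1)}
    results = {}
    for effect, denom in DENOMINATORS[model].items():
        results[effect] = (ms[effect] / ms[denom], df[effect], df[denom])
    return results

# Hypothetical mean squares for a = 3, b = 4, n = 2.
ms = {"A": 120.0, "B": 60.0, "AB": 20.0, "E": 5.0}
for model in ("I", "II", "III"):
    print(model, f_tests(ms, a=3, b=4, n=2, model=model))
```

With these made-up values, factor A gives F = 24 on (2, 12) degrees of freedom under Model I but F = 6 on (2, 6) degrees of freedom under Models II and III, which shows how strongly the conclusion can depend on the model assumed.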


4.7 POINT ESTIMATION 


In this section, we present results on point estimation for parameters of interest 
under fixed, random, and mixed effects models. 


MODEL I (FIXED EFFECTS)


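The least squares estimators⁴ of the parameters $\mu$, $\alpha_i$'s, $\beta_j$'s, and $(\alpha\beta)_{ij}$'s for
the model (4.1.1) are obtained by minimizing
$$Q = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n}(y_{ijk} - \mu - \alpha_i - \beta_j - (\alpha\beta)_{ij})^2, \tag{4.7.1}$$
with respect to $\mu$, $\alpha_i$, $\beta_j$, and $(\alpha\beta)_{ij}$; and subject to the restrictions:
$$\sum_{i=1}^{a}\alpha_i = \sum_{j=1}^{b}\beta_j = \sum_{i=1}^{a}(\alpha\beta)_{ij} = \sum_{j=1}^{b}(\alpha\beta)_{ij} = 0. \tag{4.7.2}$$

4 The least squares estimators in this case are the same as those obtained by the maximum
likelihood method under the assumption of normality.

When one performs this minimization, it can be shown by the method of elementary
calculus that the following least squares estimators are obtained:
$$\hat{\mu} = \bar{y}_{...}, \tag{4.7.3}$$
$$\hat{\alpha}_i = \bar{y}_{i..} - \bar{y}_{...}, \quad i = 1, 2, \ldots, a, \tag{4.7.4}$$
$$\hat{\beta}_j = \bar{y}_{.j.} - \bar{y}_{...}, \quad j = 1, 2, \ldots, b, \tag{4.7.5}$$
and
$$\widehat{(\alpha\beta)}_{ij} = \bar{y}_{ij.} - \bar{y}_{i..} - \bar{y}_{.j.} + \bar{y}_{...}, \quad i = 1, 2, \ldots, a; \; j = 1, 2, \ldots, b. \tag{4.7.6}$$
These are the so-called best linear unbiased estimators (BLUE). The variances
of estimators (4.7.3) through (4.7.6) are:
$$\mathrm{Var}(\hat{\mu}) = \sigma_e^2/abn, \tag{4.7.7}$$
$$\mathrm{Var}(\hat{\alpha}_i) = (a-1)\sigma_e^2/abn, \tag{4.7.8}$$
$$\mathrm{Var}(\hat{\beta}_j) = (b-1)\sigma_e^2/abn, \tag{4.7.9}$$
and
$$\mathrm{Var}(\widehat{(\alpha\beta)}_{ij}) = (a-1)(b-1)\sigma_e^2/abn. \tag{4.7.10}$$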
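The estimators (4.7.3) through (4.7.6) are simple functions of the cell, row, column, and grand means. The following sketch computes them for a balanced layout stored as a nested list `y[i][j][k]`; the function name and the small data set are made up for illustration.

```python
def least_squares_estimates(y):
    """Return (mu_hat, alpha_hat, beta_hat, ab_hat) for balanced data y[i][j][k]."""
    a, b, n = len(y), len(y[0]), len(y[0][0])
    cell = [[sum(y[i][j]) / n for j in range(b)] for i in range(a)]   # ybar_ij.
    row = [sum(cell[i]) / b for i in range(a)]                        # ybar_i..
    col = [sum(cell[i][j] for i in range(a)) / a for j in range(b)]   # ybar_.j.
    mu_hat = sum(row) / a                                             # ybar_...
    alpha_hat = [r - mu_hat for r in row]
    beta_hat = [c - mu_hat for c in col]
    ab_hat = [[cell[i][j] - row[i] - col[j] + mu_hat for j in range(b)]
              for i in range(a)]
    return mu_hat, alpha_hat, beta_hat, ab_hat

# Hypothetical 2 x 2 layout with n = 2 observations per cell.
y = [[[1, 3], [5, 7]],
     [[2, 4], [10, 12]]]
mu_hat, alpha_hat, beta_hat, ab_hat = least_squares_estimates(y)
print(mu_hat, alpha_hat, beta_hat, ab_hat)
```

The estimates satisfy the restrictions (4.7.2) automatically: each set of effect estimates sums to zero over its index.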


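The other parameters of interest may include $\mu + \alpha_i$ (mean levels of factor A),
$\mu + \beta_j$ (mean levels of factor B), $\mu + \alpha_i + \beta_j + (\alpha\beta)_{ij}$ (the cell means), pairwise
differences $\alpha_i - \alpha_{i'}$, $\beta_j - \beta_{j'}$, and the contrasts
$$\sum_{i=1}^{a}\ell_i\alpha_i \left(\sum_{i=1}^{a}\ell_i = 0\right), \quad \sum_{j=1}^{b}\ell_j\beta_j \left(\sum_{j=1}^{b}\ell_j = 0\right), \quad \text{and} \quad \sum_{i=1}^{a}\sum_{j=1}^{b}\ell_{ij}(\alpha\beta)_{ij} \left(\sum_{i=1}^{a}\sum_{j=1}^{b}\ell_{ij} = 0\right).$$
Their respective estimates together with variances are given by
$$\widehat{\mu + \alpha_i} = \bar{y}_{i..}, \quad \mathrm{Var}(\widehat{\mu + \alpha_i}) = \sigma_e^2/bn; \tag{4.7.11}$$
$$\widehat{\mu + \beta_j} = \bar{y}_{.j.}, \quad \mathrm{Var}(\widehat{\mu + \beta_j}) = \sigma_e^2/an; \tag{4.7.12}$$
$$\widehat{\mu + \alpha_i + \beta_j + (\alpha\beta)_{ij}} = \bar{y}_{ij.}, \quad \mathrm{Var}(\widehat{\mu + \alpha_i + \beta_j + (\alpha\beta)_{ij}}) = \sigma_e^2/n; \tag{4.7.13}$$
$$\widehat{\alpha_i - \alpha_{i'}} = \bar{y}_{i..} - \bar{y}_{i'..}, \quad \mathrm{Var}(\widehat{\alpha_i - \alpha_{i'}}) = 2\sigma_e^2/bn; \tag{4.7.14}$$
$$\widehat{\beta_j - \beta_{j'}} = \bar{y}_{.j.} - \bar{y}_{.j'.}, \quad \mathrm{Var}(\widehat{\beta_j - \beta_{j'}}) = 2\sigma_e^2/an; \tag{4.7.15}$$
$$\widehat{\sum_{i=1}^{a}\ell_i\alpha_i} = \sum_{i=1}^{a}\ell_i\bar{y}_{i..}, \quad \mathrm{Var}\left(\widehat{\sum_{i=1}^{a}\ell_i\alpha_i}\right) = \left(\sum_{i=1}^{a}\ell_i^2\right)\sigma_e^2/bn; \tag{4.7.16}$$
$$\widehat{\sum_{j=1}^{b}\ell_j\beta_j} = \sum_{j=1}^{b}\ell_j\bar{y}_{.j.}, \quad \mathrm{Var}\left(\widehat{\sum_{j=1}^{b}\ell_j\beta_j}\right) = \left(\sum_{j=1}^{b}\ell_j^2\right)\sigma_e^2/an; \tag{4.7.17}$$
and
$$\widehat{\sum_{i=1}^{a}\sum_{j=1}^{b}\ell_{ij}(\alpha\beta)_{ij}} = \sum_{i=1}^{a}\sum_{j=1}^{b}\ell_{ij}(\bar{y}_{ij.} - \bar{y}_{i..} - \bar{y}_{.j.} + \bar{y}_{...}); \tag{4.7.18}$$
$$\mathrm{Var}\left(\widehat{\sum_{i=1}^{a}\sum_{j=1}^{b}\ell_{ij}(\alpha\beta)_{ij}}\right) = \left(\sum_{i=1}^{a}\sum_{j=1}^{b}\ell_{ij}^2 - \frac{1}{b}\sum_{i=1}^{a}\ell_{i.}^2 - \frac{1}{a}\sum_{j=1}^{b}\ell_{.j}^2\right)\sigma_e^2/n, \tag{4.7.19}$$
where $\ell_{i.} = \sum_{j=1}^{b}\ell_{ij}$ and $\ell_{.j} = \sum_{i=1}^{a}\ell_{ij}$.

The best quadratic unbiased estimator of $\sigma_e^2$ is, of course, provided by the error
mean square.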


MODEL II (RANDOM EFFECTS) 


In the case of the random effects model, for random factors (that have significant
effects), one would often like to estimate the magnitude of the variance components.
As before, unbiased point estimators can be readily obtained by using
linear combinations of the expected mean squares in the analysis of variance
Table 4.2. For instance, $\sigma_\alpha^2$ can be estimated by noting that
$$E(MS_A) - E(MS_{AB}) = bn\sigma_\alpha^2.$$
Hence, an unbiased estimator of $\sigma_\alpha^2$ is given by
$$\hat{\sigma}_\alpha^2 = \frac{MS_A - MS_{AB}}{bn}. \tag{4.7.20}$$
Similarly,
$$\hat{\sigma}_\beta^2 = \frac{MS_B - MS_{AB}}{an}, \tag{4.7.21}$$
$$\hat{\sigma}_{\alpha\beta}^2 = \frac{MS_{AB} - MS_E}{n}, \tag{4.7.22}$$
and
$$\hat{\sigma}_e^2 = MS_E. \tag{4.7.23}$$

As usual, the parameter $\mu$ is, of course, estimated by $\hat{\mu} = \bar{y}_{...}$. The estimators
(4.7.20) through (4.7.23) are the so-called minimum variance quadratic unbiased
estimators or the minimum variance unbiased estimators under the assumption
of normality. They may, however, produce negative estimates.⁵ If one can
assume the lack of interaction terms, it is possible to use $MS_E$ in place of $MS_{AB}$
in the estimators (4.7.20) and (4.7.21). Alternatively, one can pool $MS_{AB}$ and
$MS_E$ and use this pooled estimator for $MS_{AB}$ in the expressions for $\hat{\sigma}_\alpha^2$ and $\hat{\sigma}_\beta^2$.
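The estimators (4.7.20) through (4.7.23) translate directly into code. In the sketch below, the function name and the mean square values are hypothetical; the second call illustrates how a negative estimate can arise when $MS_{AB} < MS_E$.

```python
def variance_components_model2(ms_a, ms_b, ms_ab, ms_e, a, b, n):
    """ANOVA estimators (4.7.20)-(4.7.23) for the two-way random effects model."""
    return {
        "sigma2_alpha": (ms_a - ms_ab) / (b * n),
        "sigma2_beta": (ms_b - ms_ab) / (a * n),
        "sigma2_alphabeta": (ms_ab - ms_e) / n,
        "sigma2_e": ms_e,
    }

# Hypothetical mean squares for a = 3, b = 4, n = 2.
est = variance_components_model2(120.0, 60.0, 20.0, 5.0, a=3, b=4, n=2)
print(est)

# An interaction mean square below the error mean square yields a negative estimate.
neg = variance_components_model2(120.0, 60.0, 4.0, 5.0, a=3, b=4, n=2)
print(neg["sigma2_alphabeta"])  # -0.5
```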


MODEL III (MIXED EFFECTS) 


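In the case of a mixed effects model with A fixed and B random, the parameters
to be estimated are $\mu$, $\alpha_i$'s, $\sigma_\beta^2$, $\sigma_{\alpha\beta}^2$, and $\sigma_e^2$. From the expected mean
squares column under Model III of Table 4.2, $\sigma_\beta^2$, $\sigma_{\alpha\beta}^2$, and $\sigma_e^2$ are estimated
unbiasedly by
$$\hat{\sigma}_\beta^2 = \frac{MS_B - MS_E}{an}, \tag{4.7.24}$$
$$\hat{\sigma}_{\alpha\beta}^2 = \frac{MS_{AB} - MS_E}{n}, \tag{4.7.25}$$
and
$$\hat{\sigma}_e^2 = MS_E. \tag{4.7.26}$$
Thus, although $\sigma_{\alpha\beta}^2$ and $\sigma_e^2$ have the same estimates as in the case of the random
effects model, the estimate of $\sigma_\beta^2$ is different. This general approach of estimating
variance components can be used in any mixed model. After deleting the
mean squares containing fixed factors, the remaining set of equations can be
solved for variance components.⁶

The fixed effects $\mu$, $\mu + \alpha_i$, and $\alpha_i$'s are, of course, estimated by
$$\hat{\mu} = \bar{y}_{...},$$
$$\widehat{\mu + \alpha_i} = \bar{y}_{i..}, \quad i = 1, 2, \ldots, a,$$
and
$$\hat{\alpha}_i = \bar{y}_{i..} - \bar{y}_{...}, \quad i = 1, 2, \ldots, a,$$
which are the same estimates as in the case of a fixed effects model. Furthermore,
comparisons involving pairwise contrasts can be estimated by
$$\widehat{\alpha_i - \alpha_{i'}} = \bar{y}_{i..} - \bar{y}_{i'..}.$$
To evaluate the variances of the means and contrasts, we note that, under the
restrictions $\sum_{i=1}^{a}(\alpha\beta)_{ij} = 0$,
$$\mathrm{Var}(\bar{y}_{...}) = \frac{\sigma_e^2 + an\sigma_\beta^2}{abn},$$
$$\mathrm{Var}(\bar{y}_{i..}) = \frac{\sigma_e^2 + n\frac{a-1}{a}\sigma_{\alpha\beta}^2 + n\sigma_\beta^2}{bn}, \tag{4.7.27}$$
$$\mathrm{Var}(\bar{y}_{ij.}) = \frac{\sigma_e^2 + n\frac{a-1}{a}\sigma_{\alpha\beta}^2 + n\sigma_\beta^2}{n},$$
$$\mathrm{Cov}(\bar{y}_{i..}, \bar{y}_{i'..}) = \frac{\sigma_\beta^2}{b} - \frac{\sigma_{\alpha\beta}^2}{ab}, \tag{4.7.28}$$
and
$$\mathrm{Var}(\bar{y}_{i..} - \bar{y}_{i'..}) = \frac{2(\sigma_e^2 + n\sigma_{\alpha\beta}^2)}{bn}. \tag{4.7.29}$$

5 For a discussion of the nonnegative maximum likelihood estimation, see Herbach (1959) and
Miller (1977).

6 For a discussion of the maximum likelihood estimation in the mixed model, see Szatrowski and
Miller (1980).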


4.8 INTERVAL ESTIMATION 


In this section, we present results on confidence intervals for parameters of 
interest under fixed, random, and mixed effects models. 


MODEL I (FIXED EFFECTS) 


An exact confidence interval for $\sigma_e^2$ can be based on the chi-square distribution
of $ab(n-1)MS_E/\sigma_e^2$. Thus, a $1-\alpha$ level confidence interval for $\sigma_e^2$ is
$$\frac{ab(n-1)MS_E}{\chi^2[ab(n-1), 1-\alpha/2]} < \sigma_e^2 < \frac{ab(n-1)MS_E}{\chi^2[ab(n-1), \alpha/2]}. \tag{4.8.1}$$
Furthermore, it is possible to obtain confidence intervals based on the $t$ distribution
for a particular $\alpha_i$ or a particular difference $\alpha_i - \alpha_{i'}$. For example,
$$(\bar{y}_{i..} - \bar{y}_{...}) - t[ab(n-1), 1-\alpha/2]\sqrt{(a-1)MS_E/abn} < \alpha_i < (\bar{y}_{i..} - \bar{y}_{...}) + t[ab(n-1), 1-\alpha/2]\sqrt{(a-1)MS_E/abn} \tag{4.8.2}$$
defines a $1-\alpha$ level confidence interval for $\alpha_i$.




Similarly, to obtain confidence limits for $\alpha_i - \alpha_{i'}$, we note from (4.7.14)
that
$$E(\bar{y}_{i..} - \bar{y}_{i'..}) = \alpha_i - \alpha_{i'}$$
and
$$\mathrm{Var}(\bar{y}_{i..} - \bar{y}_{i'..}) = 2\sigma_e^2/bn.$$
The confidence limits can, therefore, be derived from the relation:
$$\frac{(\bar{y}_{i..} - \bar{y}_{i'..}) - (\alpha_i - \alpha_{i'})}{\sqrt{2MS_E/bn}} \sim t[ab(n-1)]. \tag{4.8.3}$$
Similar results on confidence intervals for the $\beta_j$'s, $(\alpha\beta)_{ij}$'s, and any pairwise
differences on them, using the $t$ distribution, can also be obtained. However,
multiple comparison methods, discussed in Section 4.12, are usually
preferable.


MODEL II (RANDOM EFFECTS) 


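An exact confidence interval for the error variance component $\sigma_e^2$ is obtained as
in (4.8.1). However, as indicated in Section 2.10 for the one-way random model,
exact confidence intervals for the variance components $\sigma_{\alpha\beta}^2$, $\sigma_\beta^2$, and $\sigma_\alpha^2$ do not
exist. One can, nevertheless, obtain exact confidence intervals for $\sigma_e^2 + n\sigma_{\alpha\beta}^2$,
$\sigma_e^2 + n\sigma_{\alpha\beta}^2 + bn\sigma_\alpha^2$, $\sigma_e^2 + n\sigma_{\alpha\beta}^2 + an\sigma_\beta^2$; and ratios of particular combinations of
variance components, for example, $\sigma_{\alpha\beta}^2/\sigma_e^2$, $\sigma_\beta^2/(\sigma_e^2 + n\sigma_{\alpha\beta}^2)$, $\sigma_\alpha^2/(\sigma_e^2 + n\sigma_{\alpha\beta}^2)$,
and $(\sigma_e^2 + n\sigma_{\alpha\beta}^2)/(\sigma_e^2 + n\sigma_{\alpha\beta}^2 + bn\sigma_\alpha^2)$, by taking the appropriate ratios of
mean squares as discussed in Section 2.10. For a discussion of approximate
confidence intervals for the variance components $\sigma_{\alpha\beta}^2$, $\sigma_\beta^2$, $\sigma_\alpha^2$; for the ratios of
variance components $\sigma_\beta^2/\sigma_e^2$, $\sigma_\alpha^2/\sigma_e^2$, $\sigma_\alpha^2/\sigma_\beta^2$; and the proportions of variability
$\sigma_e^2/(\sigma_e^2 + \sigma_{\alpha\beta}^2 + \sigma_\beta^2 + \sigma_\alpha^2)$, $\sigma_{\alpha\beta}^2/(\sigma_e^2 + \sigma_{\alpha\beta}^2 + \sigma_\beta^2 + \sigma_\alpha^2)$, $\sigma_\beta^2/(\sigma_e^2 + \sigma_{\alpha\beta}^2 + \sigma_\beta^2 + \sigma_\alpha^2)$,
and $\sigma_\alpha^2/(\sigma_e^2 + \sigma_{\alpha\beta}^2 + \sigma_\beta^2 + \sigma_\alpha^2)$, including numerical examples, see Burdick and
Graybill (1992, pp. 121-124).

To obtain confidence limits for $\mu$, we note from (4.4.7) that
$$E(\bar{y}_{...}) = \mu \tag{4.8.4}$$
and
$$\mathrm{Var}(\bar{y}_{...}) = \frac{\sigma_e^2 + n\sigma_{\alpha\beta}^2 + an\sigma_\beta^2 + bn\sigma_\alpha^2}{abn}. \tag{4.8.5}$$
In order to get a mean square with expected value equal to the numerator in
(4.8.5), we use the linear combination $MS_A + MS_B - MS_{AB}$ since it has the
expected value given by
$$E(MS_A + MS_B - MS_{AB}) = \sigma_e^2 + n\sigma_{\alpha\beta}^2 + an\sigma_\beta^2 + bn\sigma_\alpha^2. \tag{4.8.6}$$
It can now be shown using the Satterthwaite procedure (see Appendix K) that
$$\frac{\bar{y}_{...} - \mu}{\sqrt{(MS_A + MS_B - MS_{AB})/abn}} \sim \text{approx. } t[\nu], \tag{4.8.7}$$
where
$$\nu = \frac{[MS_A + MS_B - MS_{AB}]^2}{\dfrac{(MS_A)^2}{a-1} + \dfrac{(MS_B)^2}{b-1} + \dfrac{(MS_{AB})^2}{(a-1)(b-1)}}. \tag{4.8.8}$$
The confidence limits for $\mu$ can now be determined in the usual way, but these
will be imprecise because of the approximation involved in (4.8.7).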
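The degrees of freedom in (4.8.8) and the standard error in (4.8.7) can be computed as in the following sketch. The function name and the mean square values are hypothetical, and the $t$ quantile needed to finish the interval is left to tables or statistical software. Because $MS_{AB}$ enters with a negative sign, the linear combination can turn out nonpositive in a given sample, a case the sketch flags rather than resolves.

```python
import math

def satterthwaite_for_mu(ms_a, ms_b, ms_ab, a, b, n):
    """Return (linear combination, approximate df nu, standard error of ybar_...)."""
    combo = ms_a + ms_b - ms_ab
    nu = combo ** 2 / (ms_a ** 2 / (a - 1)
                       + ms_b ** 2 / (b - 1)
                       + ms_ab ** 2 / ((a - 1) * (b - 1)))
    # The standard error is undefined when the combination is not positive.
    se = math.sqrt(combo / (a * b * n)) if combo > 0 else float("nan")
    return combo, nu, se

# Hypothetical mean squares for a = 3, b = 4, n = 2.
combo, nu, se = satterthwaite_for_mu(120.0, 60.0, 20.0, a=3, b=4, n=2)
print(combo, round(nu, 2), round(se, 3))
```

The resulting $\nu$ is generally not an integer; in practice it is either rounded down or used with a routine that accepts fractional degrees of freedom.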


MODEL III (MIXED EFFECTS) 


An exact confidence interval for $\sigma_e^2$ is constructed as in (4.8.1). However, as
in Model II, exact confidence intervals for $\sigma_\beta^2$ and $\sigma_{\alpha\beta}^2$ do not exist. One can,
nevertheless, obtain exact intervals for $\sigma_{\alpha\beta}^2/\sigma_e^2$ and $\sigma_\beta^2/\sigma_e^2$ by basing the procedure
on the statistics $MS_{AB}/MS_E$ and $MS_B/MS_E$, respectively. Approximate
confidence intervals for $\sigma_\beta^2$ and $\sigma_{\alpha\beta}^2$ can be constructed by the method of Satterthwaite
and other related procedures (see, e.g., Burdick and Graybill, 1992,
p. 153). Also, as in the case of Model I, it is possible to obtain confidence
intervals for $\mu$, $\alpha_i$, $\mu + \alpha_i$, $\alpha_i - \alpha_{i'}$, or the contrast $\sum_{i=1}^{a}\ell_i\alpha_i$ ($\sum_{i=1}^{a}\ell_i = 0$).
For example, an exact confidence interval for $\sum_{i=1}^{a}\ell_i\alpha_i$, with coefficient $1-\alpha$,
is given by
$$\sum_{i=1}^{a}\ell_i\bar{y}_{i..} - t[(a-1)(b-1), 1-\alpha/2]\sqrt{MS_{AB}\sum_{i=1}^{a}\ell_i^2/bn} < \sum_{i=1}^{a}\ell_i\alpha_i < \sum_{i=1}^{a}\ell_i\bar{y}_{i..} + t[(a-1)(b-1), 1-\alpha/2]\sqrt{MS_{AB}\sum_{i=1}^{a}\ell_i^2/bn}. \tag{4.8.9}$$

Thus, when dealing with a mixed model, the appropriate mean square to
be used in the estimated variance formula is no longer $MS_E$. A simple rule to
determine the appropriate mean square is: use the mean square employed in
the denominator of the test statistic for testing the presence of the fixed factor
under consideration. For instance, with the mixed model (4.1.1) where A is
fixed and B random, $MS_{AB}$ is the appropriate mean square (see Table 4.3).
The degrees of freedom in constructing the confidence interval are those associated
with the mean square utilized for estimating the variance of the contrast.
However, it is not always possible to obtain an appropriate mean square for
the desired variance estimate. For example, in order to estimate $\mu + \alpha_i$, we
notice from (4.7.27) and Table 4.2 that there is no appropriate mean square to
estimate the desired variance. An unbiased estimate of $\mathrm{Var}(\bar{y}_{i..})$, however, can
be obtained by using an appropriate linear combination of the mean squares.
An approximate $1-\alpha$ level confidence interval for $\mu + \alpha_i$ can be constructed
using
$$\bar{y}_{i..} \pm t[\nu, 1-\alpha/2]\sqrt{\widehat{\mathrm{Var}}(\bar{y}_{i..})},$$
where the degrees of freedom $\nu$ will be estimated using the Satterthwaite procedure
similar to equation (4.8.8).


4.9 COMPUTATIONAL FORMULAE AND PROCEDURE 


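As in Section 3.9, we use the following computational formulae, which are
identical to the definitional formulae given in (4.3.1):
$$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} y_{ijk}^2 - \frac{y_{...}^2}{abn},$$
$$SS_A = \frac{1}{bn}\sum_{i=1}^{a} y_{i..}^2 - \frac{y_{...}^2}{abn},$$
$$SS_B = \frac{1}{an}\sum_{j=1}^{b} y_{.j.}^2 - \frac{y_{...}^2}{abn},$$
$$SS_{AB} = \frac{1}{n}\sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij.}^2 - \frac{1}{bn}\sum_{i=1}^{a} y_{i..}^2 - \frac{1}{an}\sum_{j=1}^{b} y_{.j.}^2 + \frac{y_{...}^2}{abn},$$
and
$$SS_E = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} y_{ijk}^2 - \frac{1}{n}\sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij.}^2,$$
where as before a dot in the subscript indicates the total over the variable
represented by the index. Ordinarily, the interaction sum of squares is obtained
by the relation
$$SS_{AB} = SS_T - SS_A - SS_B - SS_E$$
or
$$SS_{AB} = SS_{TC} - SS_A - SS_B,$$
where
$$SS_{TC} = \frac{1}{n}\sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij.}^2 - \frac{y_{...}^2}{abn}.$$

The computational procedure for the sums of squares can thus be performed
in a systematic manner by the following sequence of steps:

(i) Compute the cell totals: $y_{11.}, y_{12.}, \ldots, y_{ab.}$.
(ii) Compute the row totals: $y_{1..}, y_{2..}, \ldots, y_{a..}$.
(iii) Compute the column totals: $y_{.1.}, y_{.2.}, \ldots, y_{.b.}$.
(iv) Compute the overall or grand total:
$$y_{...} = \sum_{i=1}^{a} y_{i..} = \sum_{j=1}^{b} y_{.j.} = \sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij.}.$$
(v) Compute the raw sum of squares:
$$\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} y_{ijk}^2 = y_{111}^2 + y_{112}^2 + \cdots + y_{abn}^2.$$
(vi) Compute the correction factor:
$$\frac{y_{...}^2}{abn}.$$
(vii) Compute
$$\frac{1}{bn}\sum_{i=1}^{a} y_{i..}^2.$$
(viii) Compute
$$\frac{1}{an}\sum_{j=1}^{b} y_{.j.}^2.$$
(ix) Compute
$$\frac{1}{n}\sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij.}^2.$$
(x) Compute
$$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} y_{ijk}^2 - \frac{y_{...}^2}{abn}.$$
(xi) Compute
$$SS_{TC} = \frac{1}{n}\sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij.}^2 - \frac{y_{...}^2}{abn}.$$
(xii) Compute
$$SS_A = \frac{1}{bn}\sum_{i=1}^{a} y_{i..}^2 - \frac{y_{...}^2}{abn}.$$
(xiii) Compute
$$SS_B = \frac{1}{an}\sum_{j=1}^{b} y_{.j.}^2 - \frac{y_{...}^2}{abn}.$$
(xiv) Compute $SS_{AB} = SS_T - SS_A - SS_B - SS_E = SS_{TC} - SS_A - SS_B$.
(xv) Compute $SS_E = SS_T - SS_{TC}$.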
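The sequence of steps (i) through (xv) can be sketched in code as follows. This is only an illustration for a balanced layout stored as a nested list `y[i][j][k]`; the function name and the small data set are hypothetical.

```python
def sums_of_squares(y):
    """Follow steps (i)-(xv) for balanced data y[i][j][k]; return the sums of squares."""
    a, b, n = len(y), len(y[0]), len(y[0][0])
    cell_tot = [[sum(y[i][j]) for j in range(b)] for i in range(a)]        # (i)
    row_tot = [sum(cell_tot[i]) for i in range(a)]                         # (ii)
    col_tot = [sum(cell_tot[i][j] for i in range(a)) for j in range(b)]    # (iii)
    grand = sum(row_tot)                                                   # (iv)
    raw = sum(obs ** 2 for i in range(a) for j in range(b) for obs in y[i][j])  # (v)
    cf = grand ** 2 / (a * b * n)                                          # (vi)
    rows_term = sum(t ** 2 for t in row_tot) / (b * n)                     # (vii)
    cols_term = sum(t ** 2 for t in col_tot) / (a * n)                     # (viii)
    cells_term = sum(t ** 2 for r in cell_tot for t in r) / n              # (ix)
    ss_t = raw - cf                                                        # (x)
    ss_tc = cells_term - cf                                                # (xi)
    ss_a = rows_term - cf                                                  # (xii)
    ss_b = cols_term - cf                                                  # (xiii)
    ss_ab = ss_tc - ss_a - ss_b                                            # (xiv)
    ss_e = ss_t - ss_tc                                                    # (xv)
    return {"SS_T": ss_t, "SS_A": ss_a, "SS_B": ss_b, "SS_AB": ss_ab, "SS_E": ss_e}

# Hypothetical 2 x 2 layout with n = 2 observations per cell.
y = [[[1, 3], [5, 7]],
     [[2, 4], [10, 12]]]
ss = sums_of_squares(y)
print(ss)
```

For this made-up data set the component sums of squares are 18, 72, 8, and 8, which add up to the total sum of squares of 106.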


4.10 ANALYSIS OF VARIANCE WITH UNEQUAL SAMPLE 
SIZES PER CELL 


In the two-way classification model discussed in this chapter, it has been as- 
sumed that there are the same number of replications of the experiment in each 
cell of the two-way Table 4.1. If this is not true, it is not possible to partition 
the overall sum of squares into independent components due to main effects 
and interaction terms. Care should always be taken to ensure that the number 
of observations in each cell is constant; but, even with the utmost care, it may
happen for a variety of reasons, such as loss of subjects, incomplete records,
and the like, that an experiment terminates with unequal sample sizes per cell. 
Moreover, unequal sample sizes are fairly common with many survey-type data. 
For example, an agricultural analyst may wish to study the effect of tempera- 
ture and precipitation on production of certain agricultural crops from data for 
certain counties in the country. In this type of uncontrolled study, it may easily 
happen that the number of counties in the various temperature-precipitation 
categories are not equal. 

The model in this case remains the same, except that the sample size corresponding
to the i-th level of factor A and the j-th level of factor B is now
denoted by $n_{ij}$. The data layout for a general two-way crossed classification
with unequal subclass numbers is displayed in Table 4.4. Now, the total number
of observations for the i-th level of factor A is
$$n_{i.} = \sum_{j=1}^{b} n_{ij},$$
for the j-th level of factor B is
$$n_{.j} = \sum_{i=1}^{a} n_{ij},$$
and the total number of observations is
$$N = \sum_{i=1}^{a} n_{i.} = \sum_{j=1}^{b} n_{.j} = \sum_{i=1}^{a}\sum_{j=1}^{b} n_{ij}.$$


When only a few values are missing, one could replace a missing value by 
the mean for that cell. The standard analysis of variance can then be performed 
except that for each missing value being estimated, the error degrees of free- 
dom are reduced by one. If there is a wide disparity between the numbers of 
observations in different cells, one can no longer use the standard analysis of 
variance described earlier in this chapter and must resort to some other proce- 
dures. When the sample sizes for each cell are unequal, the two-way analysis of 
variance for factor effects becomes complex. The component sums of squares in 
the analysis of variance are no longer orthogonal; that is, they do not sum to the 
total sum of squares. The least squares method for obtaining the best estimates 
of the parameters is rather complicated in the fixed effects model and the best 
analysis has not been and probably will not be found for the random effects 
models. In the following, we consider some common methods of analysis of 
variance for unequal sample size data. For a concise and readable account of 
the nonorthogonal two-way analysis of variance, see Herr and Gaebelin (1978). 


TABLE 4.4
Data for a Two-Way Crossed Classification with $n_{ij}$ Replications per Cell

                                                Factor B
Factor A    $B_1$                                        $B_2$                                        ...    $B_b$
$A_1$       $y_{111}, y_{112}, \ldots, y_{11n_{11}}$     $y_{121}, y_{122}, \ldots, y_{12n_{12}}$     ...    $y_{1b1}, y_{1b2}, \ldots, y_{1bn_{1b}}$
$A_2$       $y_{211}, y_{212}, \ldots, y_{21n_{21}}$     $y_{221}, y_{222}, \ldots, y_{22n_{22}}$     ...    $y_{2b1}, y_{2b2}, \ldots, y_{2bn_{2b}}$
...         ...                                          ...                                          ...    ...
$A_a$       $y_{a11}, y_{a12}, \ldots, y_{a1n_{a1}}$     $y_{a21}, y_{a22}, \ldots, y_{a2n_{a2}}$     ...    $y_{ab1}, y_{ab2}, \ldots, y_{abn_{ab}}$


Further details on nonorthogonal two-way analysis of variance models can be 
found in a special issue of Communications in Statistics: Part A, Theory and 
Methods (Vol. 9, No. 2, 1980). 


FIXED EFFECTS ANALYSIS 


We discuss two methods for fixed effects analysis; one for the case of propor- 
tional frequencies and the other for the general case of unequal frequencies. 


Proportional Frequencies 
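Sometimes, the unequal sample sizes follow a proportional pattern;† that is,

$$ n_{ij} = \frac{n_{i.}\, n_{.j}}{N}, \qquad (4.10.1) $$

where $n_{i.}$ and $n_{.j}$ are the row and column totals of the cell frequencies
and $N$ is the total number of observations.
The relation (4.10.1) implies that the sample sizes in any of the rows or columns
are proportional. This is called the case of proportional frequencies.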
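
The relation (4.10.1) is easy to verify mechanically. The following is a minimal sketch, not part of the original text; the function name `is_proportional` and the two small tables of cell counts are hypothetical and serve only as illustration. Exact rational arithmetic is used so that no rounding can affect the comparison.

```python
from fractions import Fraction


def is_proportional(counts):
    """Return True when every cell count equals n_i. * n_.j / N."""
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    total = sum(row_totals)
    return all(
        Fraction(row_totals[i] * col_totals[j], total) == counts[i][j]
        for i in range(len(row_totals))
        for j in range(len(col_totals))
    )


# Hypothetical cell counts: the second row is 1.5 times the first row.
proportional_counts = [[2, 4, 6], [3, 6, 9]]
# The same table with one cell changed, which breaks the pattern.
disproportional_counts = [[2, 4, 6], [3, 6, 8]]

print(is_proportional(proportional_counts))
print(is_proportional(disproportional_counts))
```

The sketch checks all $ab$ cells; the footnote's shortcut of checking fewer cells is not implemented here.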
When the frequencies are proportional, the analysis of variance discussed 
earlier in this chapter can be employed with suitable modifications. For example, 
the definitional and computational formulae for the sums of squares are: 


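$$
\begin{aligned}
SS_T &= \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n_{ij}} (y_{ijk} - \bar{y}_{...})^2
      = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n_{ij}} y_{ijk}^2 - \frac{y_{...}^2}{N}, \\
SS_A &= \sum_{i=1}^{a} n_{i.}(\bar{y}_{i..} - \bar{y}_{...})^2
      = \sum_{i=1}^{a} \frac{y_{i..}^2}{n_{i.}} - \frac{y_{...}^2}{N}, \\
SS_B &= \sum_{j=1}^{b} n_{.j}(\bar{y}_{.j.} - \bar{y}_{...})^2
      = \sum_{j=1}^{b} \frac{y_{.j.}^2}{n_{.j}} - \frac{y_{...}^2}{N}, \\
SS_{AB} &= \sum_{i=1}^{a}\sum_{j=1}^{b} n_{ij}(\bar{y}_{ij.} - \bar{y}_{i..} - \bar{y}_{.j.} + \bar{y}_{...})^2 \\
      &= \sum_{i=1}^{a}\sum_{j=1}^{b} \frac{y_{ij.}^2}{n_{ij}}
       - \sum_{i=1}^{a} \frac{y_{i..}^2}{n_{i.}}
       - \sum_{j=1}^{b} \frac{y_{.j.}^2}{n_{.j}} + \frac{y_{...}^2}{N},
\end{aligned}
\qquad (4.10.2)
$$

and

$$
SS_E = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n_{ij}} (y_{ijk} - \bar{y}_{ij.})^2
     = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n_{ij}} y_{ijk}^2
     - \sum_{i=1}^{a}\sum_{j=1}^{b} \frac{y_{ij.}^2}{n_{ij}},
$$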


† It is not necessary to check that the number of replicates $n_{ij}$ in each of the $ab$ cells follows the
relation (4.10.1). One need only check one cell in each of $a - 1$ levels of factor A and one in
each of $b - 1$ levels of factor B (Huck and Layne (1974)).


TABLE 4.5
Analysis of Variance for the Unbalanced Fixed Effects Model in
(4.1.1) with Proportional Frequencies

Source of      Degrees of        Sum of      Mean        Expected
Variation      Freedom           Squares     Square      Mean Square

Due to A       $a-1$             $SS_A$      $MS_A$      $\sigma_e^2 + \frac{1}{a-1}\sum_{i=1}^{a} n_{i.}\alpha_i^2$
Due to B       $b-1$             $SS_B$      $MS_B$      $\sigma_e^2 + \frac{1}{b-1}\sum_{j=1}^{b} n_{.j}\beta_j^2$
Interaction    $(a-1)(b-1)$      $SS_{AB}$   $MS_{AB}$   $\sigma_e^2 + \frac{1}{(a-1)(b-1)}\sum_{i=1}^{a}\sum_{j=1}^{b} n_{ij}(\alpha\beta)_{ij}^2$
A x B
Error          $N-ab$            $SS_E$      $MS_E$      $\sigma_e^2$
Total          $N-1$             $SS_T$
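In the formulae (4.10.2),

$$
\begin{aligned}
y_{ij.} &= \sum_{k=1}^{n_{ij}} y_{ijk}, & \bar{y}_{ij.} &= y_{ij.}/n_{ij}, \\
y_{i..} &= \sum_{j=1}^{b} y_{ij.}, & \bar{y}_{i..} &= y_{i..}/n_{i.}, \\
y_{.j.} &= \sum_{i=1}^{a} y_{ij.}, & \bar{y}_{.j.} &= y_{.j.}/n_{.j},
\end{aligned}
$$

and

$$
y_{...} = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n_{ij}} y_{ijk}, \qquad \bar{y}_{...} = y_{...}/N.
$$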


The analysis of variance including the expected mean squares is given in
Table 4.5. Tests of hypotheses for main effects and interaction can be carried out
as before for the case of an equal number of observations per cell. For example,
for testing the interaction effects, the statistic is $MS_{AB}/MS_E$. Under the null
hypothesis $H_0^{AB} : (\alpha\beta)_{ij} = 0$ for all $i$ and $j$, this ratio has an $F$ distribution
with $(a-1)(b-1)$ and $N - ab$ degrees of freedom; and the null hypothesis
is rejected for large values of this ratio. If there are no significant interaction
effects, the main effect due to factor A is tested by the statistic $MS_A/MS_E$
which, under the null hypothesis $H_0^{A} : \alpha_i = 0$ for all $i$, has an $F$ distribution
with $a - 1$ and $N - ab$ degrees of freedom. Similarly, the main effect due to
factor B is tested by the statistic $MS_B/MS_E$ which, under the null hypothesis
$H_0^{B} : \beta_j = 0$ for all $j$, has an $F$ distribution with $b - 1$ and $N - ab$ degrees of
freedom.
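
As an illustration of the computational formulae for proportional frequencies, the following sketch (an addition, not taken from the original text) computes the sums of squares, degrees of freedom, and $F$ ratios for a small data set. The function name `proportional_anova` and the data are hypothetical; the cell sizes 1, 2 / 2, 4 were chosen so that they satisfy (4.10.1). The definitional form of the interaction sum of squares is also evaluated so that the two forms can be compared.

```python
def proportional_anova(cells):
    """Two-way sums of squares; cells[i][j] is the list of observations in cell (i, j).

    The cell sizes are assumed to follow the proportional-frequency relation.
    """
    a, b = len(cells), len(cells[0])
    n = [[len(cell) for cell in row] for row in cells]
    n_row = [sum(row) for row in n]
    n_col = [sum(col) for col in zip(*n)]
    N = sum(n_row)

    cell_tot = [[sum(cell) for cell in row] for row in cells]
    row_tot = [sum(row) for row in cell_tot]
    col_tot = [sum(col) for col in zip(*cell_tot)]
    grand = sum(row_tot)

    correction = grand ** 2 / N
    raw = sum(y * y for row in cells for cell in row for y in cell)
    between_cells = sum(
        cell_tot[i][j] ** 2 / n[i][j] for i in range(a) for j in range(b)
    )

    ss = {
        "A": sum(row_tot[i] ** 2 / n_row[i] for i in range(a)) - correction,
        "B": sum(col_tot[j] ** 2 / n_col[j] for j in range(b)) - correction,
        "E": raw - between_cells,
        "T": raw - correction,
    }
    ss["AB"] = between_cells - correction - ss["A"] - ss["B"]

    # Definitional form of the interaction sum of squares, for comparison.
    grand_mean = grand / N
    ss_ab_def = sum(
        n[i][j]
        * (
            cell_tot[i][j] / n[i][j]
            - row_tot[i] / n_row[i]
            - col_tot[j] / n_col[j]
            + grand_mean
        )
        ** 2
        for i in range(a)
        for j in range(b)
    )

    df = {"A": a - 1, "B": b - 1, "AB": (a - 1) * (b - 1), "E": N - a * b}
    ms = {key: ss[key] / df[key] for key in df}
    f_ratio = {key: ms[key] / ms["E"] for key in ("A", "B", "AB")}
    return ss, ss_ab_def, df, ms, f_ratio


# Hypothetical data with proportional cell sizes 1, 2 / 2, 4.
cells = [
    [[10.0], [12.0, 14.0]],
    [[11.0, 13.0], [15.0, 17.0, 16.0, 18.0]],
]
ss, ss_ab_def, df, ms, f_ratio = proportional_anova(cells)
print(ss)
print(df)
print(f_ratio)
```

With disproportional cell sizes the two forms of the interaction sum of squares would in general no longer agree, which is one way to see why the balanced-type analysis breaks down in that case.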


General Case of Unequal Frequencies 

If the sample sizes $n_{ij}$'s do not vary considerably, say, by not more than the
ratio of 2 to 1, with most $n_{ij}$ being nearly equal and no $n_{ij}$ equal to zero,
an approximate analysis of variance suggested by Yates (1934), called the
method of unweighted means, may be used. This approximate method is also
used in cases where the $n_{ij}$'s do differ considerably but the researcher desires
to obtain a quick initial approximation to a more exact analysis. Note
that since $\operatorname{Var}(\bar{y}_{ij.}) = \sigma_e^2/n_{ij}$, the variances are unequal when $n_{ij}$ is not
constant.

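The procedure is rather simple: an analysis of variance is performed
using the $\bar{y}_{ij.}$'s as if there were only one observation for each $(i, j)$-th
cell. The sums of squares for the main effects, interaction, and error are calculated
in the usual way. Thus, defining $x_{ij} = \bar{y}_{ij.}$, the expressions for the sums
of squares are:

$$
\begin{aligned}
SS_{Au} &= b\sum_{i=1}^{a} (\bar{x}_{i.} - \bar{x}_{..})^2, \\
SS_{Bu} &= a\sum_{j=1}^{b} (\bar{x}_{.j} - \bar{x}_{..})^2, \\
SS_{ABu} &= \sum_{i=1}^{a}\sum_{j=1}^{b} (x_{ij} - \bar{x}_{i.} - \bar{x}_{.j} + \bar{x}_{..})^2,
\end{aligned}
\qquad (4.10.3)
$$

and

$$
SS_E = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n_{ij}} (y_{ijk} - \bar{y}_{ij.})^2,
$$

where

$$
\bar{x}_{i.} = \sum_{j=1}^{b} x_{ij}/b, \qquad \bar{x}_{.j} = \sum_{i=1}^{a} x_{ij}/a,
$$

and

$$
\bar{x}_{..} = \sum_{i=1}^{a}\sum_{j=1}^{b} x_{ij}/ab.
$$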

The analysis of variance is shown in Table 4.6 and the approximate $F$ tests
are performed based on the usual ratios of mean squares. It should, however,
be noted that the sums of squares $SS_{Au}$, $SS_{Bu}$, and $SS_{ABu}$ are computed on a
"mean" basis, whereas the $SS_E$ is computed on an "individual" basis. Thus,
$MS_E = SS_E/(N - ab)$ is not the correct term with which to test for the main
effects and interaction mean squares. It must be modified and expressed on
a "mean" basis to be comparable to the mean squares for main effects and
interaction.

TABLE 4.6
Unweighted Means Analysis for the Unbalanced Fixed Effects Model in (4.1.1)
with Disproportional Frequencies

Source of             Degrees of       Sum of        Mean
Variation             Freedom          Squares       Square

Due to A              $a-1$            $SS_{Au}$     $MS_{Au}$
Due to B              $b-1$            $SS_{Bu}$     $MS_{Bu}$
Interaction A x B     $(a-1)(b-1)$     $SS_{ABu}$    $MS_{ABu}$
Error                 $N-ab$           $SS_E$        $MS_E$
Total                 $N-1$            $SS_T$

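The expected values of the mean squares are obtained as follows (see, e.g.,
Searle (1971b, pp. 365-366)):

$$
\begin{aligned}
E(MS_{Au}) &= \frac{b}{a-1}\sum_{i=1}^{a}
  \left[\alpha_i + (\overline{\alpha\beta})_{i.} - \bar{\alpha}_{.} - (\overline{\alpha\beta})_{..}\right]^2
  + n_h^{-1}\sigma_e^2, \\
E(MS_{Bu}) &= \frac{a}{b-1}\sum_{j=1}^{b}
  \left[\beta_j + (\overline{\alpha\beta})_{.j} - \bar{\beta}_{.} - (\overline{\alpha\beta})_{..}\right]^2
  + n_h^{-1}\sigma_e^2, \\
E(MS_{ABu}) &= \frac{1}{(a-1)(b-1)}\sum_{i=1}^{a}\sum_{j=1}^{b}
  \left[(\alpha\beta)_{ij} - (\overline{\alpha\beta})_{i.} - (\overline{\alpha\beta})_{.j} + (\overline{\alpha\beta})_{..}\right]^2
  + n_h^{-1}\sigma_e^2,
\end{aligned}
\qquad (4.10.4)
$$

and

$$ E(MS_E) = \sigma_e^2, $$

where

$$
n_h = \left(\frac{1}{ab}\sum_{i=1}^{a}\sum_{j=1}^{b} n_{ij}^{-1}\right)^{-1}. \qquad (4.10.5)
$$

Note that $n_h$ represents the harmonic mean of all $a \times b$ $n_{ij}$'s. The following
features of the preceding analysis are worth noting: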


(i) The means of the $x_{ij}$'s are calculated in the usual manner; that is, $\bar{x}_{i.} =
\sum_{j=1}^{b} x_{ij}/b$, and so on.

(ii) The error sum of squares $SS_E$ is calculated exactly as in the case of
proportional frequencies.

(iii) The sums of squares do not add up to the total sum of squares. The
first three sums of squares (i.e., $SS_{Au}$, $SS_{Bu}$, and $SS_{ABu}$) add up to
$\sum_{i=1}^{a}\sum_{j=1}^{b} (\bar{y}_{ij.} - \bar{x}_{..})^2$, but all four do not add up to the total sum of
squares.

(iv) The sums of squares $SS_{Au}$, $SS_{Bu}$, and $SS_{ABu}$ do not have chi-square type
distributions as in the case of model (4.1.1), nor are they independent.

(v) The sum of squares $SS_E$ is independent of $SS_{Au}$, $SS_{Bu}$, and $SS_{ABu}$, and
$SS_E/\sigma_e^2$ has an exact chi-square distribution. All other sums of squares
(divided by $\sigma_e^2$) have only approximate noncentral (or central under $H_0$)
chi-square distributions.
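
The unweighted-means computations described above can be sketched as follows. This is an illustrative addition rather than the authors' own code: the function name `unweighted_means` and the small unbalanced data set are hypothetical, and the $F$ ratios use the error mean square divided by the harmonic mean $n_h$ of the cell sizes, which is the "mean basis" adjustment mentioned in the text.

```python
def unweighted_means(cells):
    """Unweighted-means analysis; cells[i][j] is the list of observations in cell (i, j)."""
    a, b = len(cells), len(cells[0])
    n = [[len(cell) for cell in row] for row in cells]
    N = sum(sum(row) for row in n)

    x = [[sum(cell) / len(cell) for cell in row] for row in cells]  # cell means x_ij
    x_row = [sum(row) / b for row in x]            # unweighted row means
    x_col = [sum(col) / a for col in zip(*x)]      # unweighted column means
    x_all = sum(x_row) / a                         # unweighted grand mean

    ss_au = b * sum((m - x_all) ** 2 for m in x_row)
    ss_bu = a * sum((m - x_all) ** 2 for m in x_col)
    ss_abu = sum(
        (x[i][j] - x_row[i] - x_col[j] + x_all) ** 2
        for i in range(a)
        for j in range(b)
    )
    ss_e = sum(
        (y - x[i][j]) ** 2
        for i in range(a)
        for j in range(b)
        for y in cells[i][j]
    )

    # Harmonic mean of the cell sizes.
    n_h = a * b / sum(1 / n[i][j] for i in range(a) for j in range(b))
    # Error mean square expressed on a "mean" basis.
    error_mean_basis = ss_e / (N - a * b) / n_h

    return {
        "SS_Au": ss_au,
        "SS_Bu": ss_bu,
        "SS_ABu": ss_abu,
        "SS_E": ss_e,
        "n_h": n_h,
        "F_A": ss_au / (a - 1) / error_mean_basis,
        "F_B": ss_bu / (b - 1) / error_mean_basis,
        "F_AB": ss_abu / ((a - 1) * (b - 1)) / error_mean_basis,
    }


# Hypothetical unbalanced 2 x 2 layout with cell sizes 3, 2 / 2, 4.
cells = [
    [[10.0, 12.0, 11.0], [14.0, 15.0]],
    [[9.0, 11.0], [16.0, 18.0, 17.0, 19.0]],
]
result = unweighted_means(cells)
for name, value in result.items():
    print(name, round(value, 4))
```

The resulting ratios are only approximately $F$ distributed, as noted in items (iv) and (v).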


Since the mean squares in Table 4.6 do not have exact chi-square distributions,
their ratios do not provide exact $F$ statistics for testing hypotheses of interest.
However, Gosslee and Lucas (1965) indicated that they provide reasonably
adequate $F$ statistics using modified degrees of freedom for the numerator
mean squares. For example, the modified numerator degrees of freedom for
$MS_{Au}/MS_E$ is

$$
\nu_a = \frac{(a-1)^2\left(\sum_{i=1}^{a} h_i\right)^2}
             {\left(\sum_{i=1}^{a} h_i\right)^2 + a(a-2)\sum_{i=1}^{a} h_i^2},
\qquad (4.10.6)
$$


where 


Similarly, Rankin (1974) has shown that the approximate $F$ tests give satisfactory
results provided the ratios of sample sizes do not exceed 3. He also
investigated the problem of modifying the numerator degrees of freedom to adjust
for irregularities in sample sizes. Note that although the amended degrees
of freedom (4.10.6) modify $MS_{Au}/MS_E$ to be an approximate $F$ statistic, we
observe from (4.10.4) that the hypothesis it tests is the equality of $\alpha_i + (\overline{\alpha\beta})_{i.}$
for all $i$. Furthermore, since the observation $x_{ij} = \bar{y}_{ij.}$ has variance $\sigma_e^2/n_{ij}$, the
average variance of all the "observations" is

$$
\frac{1}{ab}\sum_{i=1}^{a}\sum_{j=1}^{b} \frac{\sigma_e^2}{n_{ij}}
 = \frac{\sigma_e^2}{ab}\sum_{i=1}^{a}\sum_{j=1}^{b} n_{ij}^{-1}
 = \frac{\sigma_e^2}{n_h},
$$

and the estimated average variance of the "observations" is

$$ \frac{MS_E}{n_h}, $$

where $n_h$ is the harmonic mean of the $n_{ij}$'s defined in (4.10.5).

An alternative to the unweighted-means analysis is the weighted-squares-of-means
analysis also proposed by Yates (1934). In this technique, the interaction
and error sums of squares are defined as earlier, but the $SS_A$ and $SS_B$ sums of
squares are weighted in inverse proportion to their variances according to the
number of observations in the cell. Thus, letting $x_{ij} = \bar{y}_{ij.}$, the corresponding
weighted sums of squares are:

$$ SS_{Aw} = \sum_{i=1}^{a} w_{Ai}(\bar{x}_{i.} - \bar{x}_{A})^2 $$

and

$$ SS_{Bw} = \sum_{j=1}^{b} w_{Bj}(\bar{x}_{.j} - \bar{x}_{B})^2, $$

where

$$
w_{Ai} = \frac{b^2}{\sum_{j=1}^{b} \dfrac{1}{n_{ij}}}, \qquad
\bar{x}_{A} = \frac{\sum_{i=1}^{a} w_{Ai}\,\bar{x}_{i.}}{\sum_{i=1}^{a} w_{Ai}},
$$

$$
w_{Bj} = \frac{a^2}{\sum_{i=1}^{a} \dfrac{1}{n_{ij}}}, \qquad
\bar{x}_{B} = \frac{\sum_{j=1}^{b} w_{Bj}\,\bar{x}_{.j}}{\sum_{j=1}^{b} w_{Bj}},
$$

and $\bar{x}_{i.}$, $\bar{x}_{.j}$, and $\bar{x}_{..}$ are defined as earlier in the case of unweighted-means
analysis. The interaction and error sums of squares are of course given as in
unweighted-means analysis; that is,

$$ SS_{ABw} = \sum_{i=1}^{a}\sum_{j=1}^{b} (x_{ij} - \bar{x}_{i.} - \bar{x}_{.j} + \bar{x}_{..})^2 $$

and

$$ SS_E = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n_{ij}} (y_{ijk} - \bar{y}_{ij.})^2. $$

TABLE 4.7
Weighted Means Analysis for the Unbalanced Fixed Effects Model in (4.1.1)
with Disproportional Frequencies

Source of             Degrees of       Sum of        Mean
Variation             Freedom          Squares       Square

Due to A              $a-1$            $SS_{Aw}$     $MS_{Aw}$
Due to B              $b-1$            $SS_{Bw}$     $MS_{Bw}$
Interaction A x B     $(a-1)(b-1)$     $SS_{ABw}$    $MS_{ABw}$
Error                 $N-ab$           $SS_E$        $MS_E$
Total                 $N-1$            $SS_T$
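
The weighted main-effect sums of squares can be sketched in a few lines. The code below is an illustrative addition; the function name `weighted_squares_of_means` and the data are hypothetical. It follows the definitions of $w_{Ai}$, $w_{Bj}$, $\bar{x}_A$, and $\bar{x}_B$ given above and forms the main-effect ratios against $MS_E$.

```python
def weighted_squares_of_means(cells):
    """Weighted-squares-of-means main effects; cells[i][j] lists the observations."""
    a, b = len(cells), len(cells[0])
    n = [[len(cell) for cell in row] for row in cells]
    N = sum(sum(row) for row in n)

    x = [[sum(cell) / len(cell) for cell in row] for row in cells]  # cell means
    x_row = [sum(row) / b for row in x]
    x_col = [sum(col) / a for col in zip(*x)]

    # Weights inversely proportional to the variances of the unweighted means.
    w_a = [b ** 2 / sum(1 / n[i][j] for j in range(b)) for i in range(a)]
    w_b = [a ** 2 / sum(1 / n[i][j] for i in range(a)) for j in range(b)]

    xbar_a = sum(w * m for w, m in zip(w_a, x_row)) / sum(w_a)
    xbar_b = sum(w * m for w, m in zip(w_b, x_col)) / sum(w_b)

    ss_aw = sum(w * (m - xbar_a) ** 2 for w, m in zip(w_a, x_row))
    ss_bw = sum(w * (m - xbar_b) ** 2 for w, m in zip(w_b, x_col))

    ss_e = sum(
        (y - x[i][j]) ** 2
        for i in range(a)
        for j in range(b)
        for y in cells[i][j]
    )
    ms_e = ss_e / (N - a * b)

    return {
        "SS_Aw": ss_aw,
        "SS_Bw": ss_bw,
        "MS_E": ms_e,
        "F_A": ss_aw / (a - 1) / ms_e,
        "F_B": ss_bw / (b - 1) / ms_e,
    }


# Hypothetical unbalanced 2 x 2 layout with cell sizes 3, 2 / 2, 4.
cells = [
    [[10.0, 12.0, 11.0], [14.0, 15.0]],
    [[9.0, 11.0], [16.0, 18.0, 17.0, 19.0]],
]
weighted = weighted_squares_of_means(cells)
for name, value in weighted.items():
    print(name, round(value, 4))
```

For balanced data each weight reduces to $bn$ (or $an$), so the weighted sums of squares coincide with the usual main-effect sums of squares.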


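The complete analysis of variance is shown in Table 4.7 where expected
values of the mean squares are obtained as follows (see, e.g., Searle (1971b,
pp. 369-371)):

$$
\begin{aligned}
E(MS_{Aw}) &= \frac{1}{a-1}\sum_{i=1}^{a} w_{Ai}
  \left[\alpha_i + (\overline{\alpha\beta})_{i.}
  - \frac{\sum_{i=1}^{a} w_{Ai}\left(\alpha_i + (\overline{\alpha\beta})_{i.}\right)}{\sum_{i=1}^{a} w_{Ai}}\right]^2
  + \sigma_e^2, \\
E(MS_{Bw}) &= \frac{1}{b-1}\sum_{j=1}^{b} w_{Bj}
  \left[\beta_j + (\overline{\alpha\beta})_{.j}
  - \frac{\sum_{j=1}^{b} w_{Bj}\left(\beta_j + (\overline{\alpha\beta})_{.j}\right)}{\sum_{j=1}^{b} w_{Bj}}\right]^2
  + \sigma_e^2, \\
E(MS_{ABw}) &= \frac{1}{(a-1)(b-1)}\sum_{i=1}^{a}\sum_{j=1}^{b}
  \left[(\alpha\beta)_{ij} - (\overline{\alpha\beta})_{i.} - (\overline{\alpha\beta})_{.j} + (\overline{\alpha\beta})_{..}\right]^2
  + n_h^{-1}\sigma_e^2,
\end{aligned}
$$

and

$$ E(MS_E) = \sigma_e^2. $$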


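It can be shown that the variance ratios $MS_{Aw}/MS_E$, $MS_{Bw}/MS_E$, and
$MS_{ABw}/(n_h^{-1} MS_E)$ provide exact tests of overall hypotheses concerning $\alpha_i$'s, $\beta_j$'s, and
$(\alpha\beta)_{ij}$'s. When the data are balanced, the null hypotheses being tested are:

$$
H_0^{A} : \text{all } \alpha_i = 0, \qquad
H_0^{B} : \text{all } \beta_j = 0, \qquad
H_0^{AB} : \text{all } (\alpha\beta)_{ij} = 0. \qquad (4.10.7)
$$

However, when data are unbalanced, the corresponding null hypotheses being
tested are:

$$
\begin{aligned}
&H_0^{A} : \text{all } \alpha_i + (\overline{\alpha\beta})_{i.} \text{ are equal}, \qquad
H_0^{B} : \text{all } \beta_j + (\overline{\alpha\beta})_{.j} \text{ are equal}, \\
&H_0^{AB} : \text{all } (\alpha\beta)_{ij} - (\overline{\alpha\beta})_{i.} - (\overline{\alpha\beta})_{.j} + (\overline{\alpha\beta})_{..} \text{ are equal}.
\end{aligned}
\qquad (4.10.8)
$$

Thus, only with the restrictions

$$
(\overline{\alpha\beta})_{i.} = 0, \quad i = 1, 2, \ldots, a; \qquad
(\overline{\alpha\beta})_{.j} = 0, \quad j = 1, 2, \ldots, b
$$

the null hypotheses (4.10.7) and (4.10.8) are equivalent.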

Federer and Zelen (1966) present another approximate analysis that is more
exact, but somewhat more complicated than the unweighted and weighted analyses
discussed here. Still another approximate method, called the method of
expected subclass numbers, can be found in Bancroft (1968, pp. 37-41). In situations
when the approximate methods are not applicable, for example, badly
unbalanced designs (with 10 or more observations in some cells and only a few
in others) and designs with empty cells, a method based on multiple regression
analysis may be used. The method consists of considering the analysis of variance
model as a regression model, fitting the model for the data, obtaining sums
of squares for main effects and interactions as the regression sums of squares,
and using the general inferential techniques for the regression model. There are
various methods for carrying out this analysis and different methods may lead to
different results. For example, one has to determine whether one will test the $SS_A$
adjusted for $\{(\alpha\beta)_{ij}\}$ and $\{\beta_j\}$, or only for $\{\beta_j\}$ if the $\{(\alpha\beta)_{ij}\}$ are not significant, or
test the unadjusted $SS_A$ against $SS_E$. Furthermore, the sums of squares are no
longer orthogonal and the sequence in which hypotheses involving fixed effects
$\{\alpha_i\}$, $\{\beta_j\}$, $\{(\alpha\beta)_{ij}\}$ are tested may lead to different results. In addition, a unique
partition of sums of squares does not exist and the hypotheses being tested do
not always correspond to the case of balanced design. For additional details regarding
this approach see Draper and Smith (1981) and Searle (1971b, 1987).


RANDOM EFFECTS ANALYSIS 


The problems of testing hypotheses and estimation of variance components en- 
countered in unbalanced designs of random effects models having two or more 
factors are much more complicated than the corresponding balanced case. We 
again consider two cases for the random effects analysis, one for the case of pro- 
portional frequencies and the other for the general case of unequal frequencies. 


Proportional Frequencies 

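For the case of proportional frequencies, the expected values of mean squares
can be obtained by the Wilk and Kempthorne (1955) formula. For example,
letting $n_{ij} = (n_{i.} n_{.j})/N$, we obtain (see, e.g., Snedecor and Cochran (1967, pp.
478-483)):

$$
\begin{aligned}
E(MS_A) &= \sigma_e^2 + \frac{N}{a-1}\left(1 - \sum_{i=1}^{a}\frac{n_{i.}^2}{N^2}\right)
  \left[\sigma_\alpha^2 + \left(\sum_{j=1}^{b}\frac{n_{.j}^2}{N^2}\right)\sigma_{\alpha\beta}^2\right], \\
E(MS_B) &= \sigma_e^2 + \frac{N}{b-1}\left(1 - \sum_{j=1}^{b}\frac{n_{.j}^2}{N^2}\right)
  \left[\sigma_\beta^2 + \left(\sum_{i=1}^{a}\frac{n_{i.}^2}{N^2}\right)\sigma_{\alpha\beta}^2\right], \\
E(MS_{AB}) &= \sigma_e^2 + \frac{N}{(a-1)(b-1)}
  \left(1 - \sum_{i=1}^{a}\frac{n_{i.}^2}{N^2}\right)
  \left(1 - \sum_{j=1}^{b}\frac{n_{.j}^2}{N^2}\right)\sigma_{\alpha\beta}^2,
\end{aligned}
$$

and

$$ E(MS_E) = \sigma_e^2. $$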


Approximate tests of hypotheses and variance components estimates can be 
constructed as earlier. For detailed discussion and numerical examples, see 
Bancroft (1968, Section 1.6). 


General Case of Unequal Frequencies 
For the general case of unequal frequencies, Hirotsu (1968) proposed approximate
$F$ tests for testing the hypotheses:

$$
\begin{aligned}
&H_0^{A} : \sigma_\alpha^2 = 0 \quad \text{versus} \quad H_1^{A} : \sigma_\alpha^2 > 0, \\
&H_0^{B} : \sigma_\beta^2 = 0 \quad \text{versus} \quad H_1^{B} : \sigma_\beta^2 > 0,
\end{aligned}
\qquad (4.10.9)
$$

and

$$
H_0^{AB} : \sigma_{\alpha\beta}^2 = 0 \quad \text{versus} \quad H_1^{AB} : \sigma_{\alpha\beta}^2 > 0,
$$

by using the test statistics analogous to those in the balanced case where now the
mean squares are those obtained in the unweighted-means analysis discussed
earlier in this section. Thus, the proposed test statistics are:

$$
\begin{aligned}
&MS_{Au}/MS_{ABu}, \quad \text{for } H_0^{A}; \\
&MS_{Bu}/MS_{ABu}, \quad \text{for } H_0^{B};
\end{aligned}
\qquad (4.10.10)
$$

and

$$ MS_{ABu}/(n_h^{-1} MS_E), \quad \text{for } H_0^{AB}; $$

where

$$
n_h = \left(\frac{1}{ab}\sum_{i=1}^{a}\sum_{j=1}^{b} n_{ij}^{-1}\right)^{-1}.
$$

The test statistics (4.10.10) are to be compared with the $100(1 - \alpha)$th percentage
points of the $F$ distribution with the degrees of freedom $[(a-1), (a-1)(b-1)]$,
$[(b-1), (a-1)(b-1)]$, and $[(a-1)(b-1), N - ab]$, respectively.


Remark: Hirotsu (1968) gave the expressions for the power functions of the tests 
(4.10.10) with numerical examples, which, however, tend to be very complex. Spjøtvoll
(1968) and Thomsen (1975) proposed exact tests for main effects variance components
under the assumption that the interaction variance component is zero. Khuri and Littell
(1987) developed exact tests of variance components that do not require the assump- 
tion of nonexistence of interaction variance component. Hussein and Milliken (1978a) 
considered tests for main effects variance components in a heteroscedastic situation 
assuming that the interaction variance component is zero. Similarly, Tan et al. (1988) 
reported tests for main effects as well as interaction variance components involving a 
heteroscedastic model. 


For the estimation of variance components, three methods of estimation 
were initially proposed and studied in some detail by Henderson (1953). The 
methods were reexamined and represented in elegant matrix notations by Searle 
(1968). Since then a variety of new procedures have been developed and the 
theory has been extended in a number of different directions. Rao (1971, 
1972) introduced the concept of minimum norm quadratic unbiased estima- 
tion (MINQUE). Similarly, LaMotte (1973) considered minimum variance 
quadratic unbiased estimators (MIVQUE) and Pukelsheim (1981) investigated 
the existence of nonnegative unbiased estimators. For detailed discussions of
these and other developments in the field the reader is referred to Searle et al.
(1992) and Rao (1997).

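To illustrate the nature of the problem of estimating variance components for
the case of unbalanced cell frequencies, consider an experiment for which the
following two-way additive model is appropriate:

$$
y_{ijk} = \mu + \alpha_i + \beta_j + e_{ijk}, \qquad
\begin{cases}
i = 1, 2, \ldots, a \\
j = 1, 2, \ldots, b \\
k = 0, 1, \ldots, n_{ij},
\end{cases}
\qquad (4.10.11)
$$

where the $\alpha_i$'s, $\beta_j$'s, and $e_{ijk}$'s are uncorrelated random variables with mean
zero and variances $\sigma_\alpha^2$, $\sigma_\beta^2$, and $\sigma_e^2$, respectively. Let the total sum of squares be
partitioned as follows:

$$
\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n_{ij}} y_{ijk}^2 - \frac{y_{...}^2}{N}
 = \left[\sum_{i=1}^{a}\frac{y_{i..}^2}{n_{i.}} - \frac{y_{...}^2}{N}\right]
 + \left[\sum_{j=1}^{b}\frac{y_{.j.}^2}{n_{.j}} - \frac{y_{...}^2}{N}\right]
 + SS_E
$$

or

$$ SS_T = SS_A + SS_B + SS_E, $$

where

$$ SS_E = SS_T - SS_A - SS_B. $$

Note that it is possible that $SS_E$ may be negative. The derivation of expected
values of the mean squares is complicated and the results may be shown to be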

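those given in Table 4.8 (see, e.g., Graybill (1961, pp. 360-362)), where the
coefficients of variance components are determined as follows:

$$
\begin{aligned}
c_{aa} &= \frac{1}{a-1}\left(N - \sum_{i=1}^{a}\frac{n_{i.}^2}{N}\right), &
c_{ab} &= \frac{1}{a-1}\left(\sum_{i=1}^{a}\sum_{j=1}^{b}\frac{n_{ij}^2}{n_{i.}} - \sum_{j=1}^{b}\frac{n_{.j}^2}{N}\right), \\
c_{bb} &= \frac{1}{b-1}\left(N - \sum_{j=1}^{b}\frac{n_{.j}^2}{N}\right), &
c_{ba} &= \frac{1}{b-1}\left(\sum_{i=1}^{a}\sum_{j=1}^{b}\frac{n_{ij}^2}{n_{.j}} - \sum_{i=1}^{a}\frac{n_{i.}^2}{N}\right),
\end{aligned}
$$

and

$$
c_{ea} = \frac{1}{N-a-b+1}\left(\sum_{i=1}^{a}\frac{n_{i.}^2}{N} - \sum_{i=1}^{a}\sum_{j=1}^{b}\frac{n_{ij}^2}{n_{.j}}\right), \qquad
c_{eb} = \frac{1}{N-a-b+1}\left(\sum_{j=1}^{b}\frac{n_{.j}^2}{N} - \sum_{i=1}^{a}\sum_{j=1}^{b}\frac{n_{ij}^2}{n_{i.}}\right).
$$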


TABLE 4.8
Analysis of Variance for Model (4.10.11)

Source of      Degrees of       Sum of      Mean       Expected
Variation      Freedom          Squares     Square     Mean Square

Due to A       $a-1$            $SS_A$      $MS_A$     $\sigma_e^2 + c_{ab}\sigma_\beta^2 + c_{aa}\sigma_\alpha^2$
Due to B       $b-1$            $SS_B$      $MS_B$     $\sigma_e^2 + c_{bb}\sigma_\beta^2 + c_{ba}\sigma_\alpha^2$
Error          $N-a-b+1$        $SS_E$      $MS_E$     $\sigma_e^2 + c_{eb}\sigma_\beta^2 + c_{ea}\sigma_\alpha^2$


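If one desires to estimate variance components by the analysis of variance
method, that is, by equating mean squares to their corresponding expected
values, one obtains the following system of equations:

$$
\begin{aligned}
MS_A &= \hat{\sigma}_e^2 + c_{ab}\hat{\sigma}_\beta^2 + c_{aa}\hat{\sigma}_\alpha^2, \\
MS_B &= \hat{\sigma}_e^2 + c_{bb}\hat{\sigma}_\beta^2 + c_{ba}\hat{\sigma}_\alpha^2, \\
MS_E &= \hat{\sigma}_e^2 + c_{eb}\hat{\sigma}_\beta^2 + c_{ea}\hat{\sigma}_\alpha^2,
\end{aligned}
\qquad (4.10.12)
$$

where parameters have been replaced by their respective estimators. The resultant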
tant solution of the system of equations (4.10.12) provides a set of estimators 
of the variance components. The evaluation of explicit expressions for the esti- 
mators is somewhat involved and the results can be found in Searle (1971b, 
p. 487) and Searle et al. (1992, p. 439). The estimators obtained will be un- 
biased and consistent, but other optimum properties are still being explored. 
Although sampling variances can be obtained, other distribution properties can- 
not, since even under the normality assumptions, distribution of the estimators 
is unknown. 
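
The system (4.10.12) is a set of three linear equations in three unknowns and can be solved numerically once the coefficients are known. The sketch below is an illustrative addition, not the authors' procedure. The function names are hypothetical, and the coefficient formulas in the code are an assumption: they were written out by taking expectations of the sums of squares in the partition $SS_T = SS_A + SS_B + SS_E$ under model (4.10.11), and should be checked against the cited references before use. Empty cells are allowed as long as every row and column total is positive.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix given as nested lists."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def anova_variance_components(n, ms_a, ms_b, ms_e):
    """Solve the three ANOVA equations for (sigma2_e, sigma2_beta, sigma2_alpha).

    n[i][j] holds the cell counts. The coefficients below are assumptions
    derived from the expected sums of squares of the additive model.
    """
    a, b = len(n), len(n[0])
    n_row = [sum(row) for row in n]
    n_col = [sum(col) for col in zip(*n)]
    N = sum(n_row)

    k_a = sum(v * v for v in n_row) / N    # sum of n_i.^2 / N
    k_b = sum(v * v for v in n_col) / N    # sum of n_.j^2 / N
    k_ab = sum(n[i][j] ** 2 / n_row[i] for i in range(a) for j in range(b))
    k_ba = sum(n[i][j] ** 2 / n_col[j] for i in range(a) for j in range(b))
    df_e = N - a - b + 1

    c_aa, c_ab = (N - k_a) / (a - 1), (k_ab - k_b) / (a - 1)
    c_bb, c_ba = (N - k_b) / (b - 1), (k_ba - k_a) / (b - 1)
    c_ea, c_eb = (k_a - k_ba) / df_e, (k_b - k_ab) / df_e

    # Unknowns ordered as (sigma2_e, sigma2_beta, sigma2_alpha).
    matrix = [[1.0, c_ab, c_aa], [1.0, c_bb, c_ba], [1.0, c_eb, c_ea]]
    rhs = [ms_a, ms_b, ms_e]

    # Cramer's rule: replace one column at a time by the right-hand side.
    d = det3(matrix)
    solution = []
    for col in range(3):
        replaced = [row[:] for row in matrix]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(det3(replaced) / d)
    return tuple(solution)


# Hypothetical cell counts and mean squares, for illustration only.
counts = [[2, 3, 1], [4, 2, 3]]
estimates = anova_variance_components(counts, ms_a=50.0, ms_b=30.0, ms_e=10.0)
print(estimates)
```

Nothing in the solution prevents a negative estimate, which is a known drawback of the analysis of variance method.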

The only functions of variance components for which exact intervals can be 
obtained are o? and o7,/0;. For a discussion of the problem of setting confi- 
dence intervals for the individual variance components, certain ratios of variance 
components and proportions of variability, including numerical examples, see 
Burdick and Graybill (1992, pp. 136-145). 


MIXED EFFECTS ANALYSIS 


Most of the inferential difficulties that are encountered occur in the mixed effects 
model. The treatment of the unbalanced mixed model is beyond the scope of 
this volume. The interested reader is referred to Searle (1971b, pp. 429-431; 
1987, Chapter 13; 1988), Stroup (1989), McLean et al. (1991), Hocking (1993), 
and Khuri et al. (1998). Smith (1951) discusses the tests of hypotheses for the 
mixed model with proportional frequencies. For a discussion of exact tests for 
the random and fixed effects in an unbalanced two-way crossed classification 
model, see Gallo and Khuri (1990). Burdick and Graybill (1992, p. 172) give a


numerical example illustrating the computation of an exact interval for O58 /o2 
and an approximate interval for Op. 


4.11 POWER OF THE ANALYSIS OF VARIANCE F TESTS 


The power of the analysis of variance F tests for AB interactions, factor B 
effects, and factor A effects can be evaluated in a manner similar to the case of 
one-way classification. The results on power calculations are briefly summa- 
rized in the following. 


MODEL I (FIXED EFFECTS) 


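The parameter $\phi$ and the appropriate degrees of freedom for each of the tests
are as follows.

Test for AB Interactions

$$ \text{Power} = P\{F'[\nu_1, \nu_2; \phi] > F[\nu_1, \nu_2; 1 - \alpha]\}, $$

where

$$ \nu_1 = (a-1)(b-1), \qquad \nu_2 = ab(n-1), $$

and

$$
\phi = \frac{1}{\sigma_e}\sqrt{\frac{n\sum_{i=1}^{a}\sum_{j=1}^{b}(\alpha\beta)_{ij}^2}{(a-1)(b-1) + 1}}.
$$

Test for Factor B Effects

$$ \text{Power} = P\{F'[\nu_1, \nu_2; \phi] > F[\nu_1, \nu_2; 1 - \alpha]\}, $$

where

$$ \nu_1 = b-1, \qquad \nu_2 = ab(n-1), $$

and

$$ \phi = \frac{1}{\sigma_e}\sqrt{\frac{an\sum_{j=1}^{b}\beta_j^2}{b}}. $$

Test for Factor A Effects

$$ \text{Power} = P\{F'[\nu_1, \nu_2; \phi] > F[\nu_1, \nu_2; 1 - \alpha]\}, $$

where

$$ \nu_1 = a-1, \qquad \nu_2 = ab(n-1), $$

and

$$ \phi = \frac{1}{\sigma_e}\sqrt{\frac{bn\sum_{i=1}^{a}\alpha_i^2}{a}}. $$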

Remark: Kastenbaum et al. (1970b) gave tables showing how large $n$ must be ($1 < n < 5$)
for $a = 2(1)6$ and $b = 2(1)5$ in testing for factor A effects with $\alpha = 0.05, 0.01$,
and $1 - \beta = 0.7, 0.8, 0.9, 0.95, 0.99, 0.995$, when $\max|\alpha_i - \alpha_{i'}|/\sigma_e$ is given. More
extensive tables are given by Bowman (1972) and Bowman and Kastenbaum (1975).


MODEL II (RANDOM EFFECTS) 


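The power calculations involve only the central $F$ distribution. The results for
each of the tests are as follows.

Test for AB Interactions

$$ \text{Power} = P\{F[\nu_1, \nu_2] > \lambda^{-2} F[\nu_1, \nu_2; 1 - \alpha]\}, $$

where

$$ \nu_1 = (a-1)(b-1), \qquad \nu_2 = ab(n-1), $$

and

$$ \lambda = \sqrt{1 + \frac{n\sigma_{\alpha\beta}^2}{\sigma_e^2}}. $$

Test for Factor B Effects

$$ \text{Power} = P\{F[\nu_1, \nu_2] > \lambda^{-2} F[\nu_1, \nu_2; 1 - \alpha]\}, $$

where

$$ \nu_1 = b-1, \qquad \nu_2 = (a-1)(b-1), $$

and

$$ \lambda = \sqrt{1 + \frac{an\sigma_\beta^2}{\sigma_e^2 + n\sigma_{\alpha\beta}^2}}. $$

Test for Factor A Effects

$$ \text{Power} = P\{F[\nu_1, \nu_2] > \lambda^{-2} F[\nu_1, \nu_2; 1 - \alpha]\}, $$

where

$$ \nu_1 = a-1, \qquad \nu_2 = (a-1)(b-1), $$

and

$$ \lambda = \sqrt{1 + \frac{bn\sigma_\alpha^2}{\sigma_e^2 + n\sigma_{\alpha\beta}^2}}. $$

The power of the tests of the hypotheses of the type $\sigma_{\alpha\beta}^2/\sigma_e^2 \le \rho_1$,
$\sigma_\beta^2/(\sigma_e^2 + n\sigma_{\alpha\beta}^2) \le \rho_2$, or $\sigma_\alpha^2/(\sigma_e^2 + n\sigma_{\alpha\beta}^2) \le \rho_3$ can similarly be expressed in terms of the
central $F$ distribution.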


MODEL III (MIXED EFFECTS) 


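The power of the tests for AB interactions and B effects involves central $F$
distributions and for A effects involves the noncentral $F$ distribution. The results
are as follows.

Test for AB Interactions

$$ \text{Power} = P\{F[\nu_1, \nu_2] > \lambda^{-2} F[\nu_1, \nu_2; 1 - \alpha]\}, $$

where

$$ \nu_1 = (a-1)(b-1), \qquad \nu_2 = ab(n-1), $$

and

$$ \lambda = \sqrt{1 + \frac{n\sigma_{\alpha\beta}^2}{\sigma_e^2}}. $$

Test for Factor B Effects

$$ \text{Power} = P\{F[\nu_1, \nu_2] > \lambda^{-2} F[\nu_1, \nu_2; 1 - \alpha]\}, $$

where

$$ \nu_1 = b-1, \qquad \nu_2 = ab(n-1), $$

and

$$ \lambda = \sqrt{1 + \frac{an\sigma_\beta^2}{\sigma_e^2}}. $$

Test for Factor A Effects

$$ \text{Power} = P\{F'[\nu_1, \nu_2; \phi] > F[\nu_1, \nu_2; 1 - \alpha]\}, $$

where

$$ \nu_1 = a-1, \qquad \nu_2 = (a-1)(b-1), $$

and

$$ \phi = \frac{1}{\sqrt{\sigma_e^2 + n\sigma_{\alpha\beta}^2}}\sqrt{\frac{bn\sum_{i=1}^{a}\alpha_i^2}{a}}. $$

The power of the tests of the hypotheses of the type $\sigma_{\alpha\beta}^2/\sigma_e^2 \le \rho_1$ or $\sigma_\beta^2/\sigma_e^2 \le \rho_2$
can similarly be expressed in terms of the central $F$ distribution.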
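
For the tests that involve only the central $F$ distribution, the power depends on the design solely through the scale factor $\lambda$ and the degrees of freedom. The following sketch is an illustrative addition with hypothetical function names; it computes $\lambda$ for the random effects model and the scaled critical value $\lambda^{-2} F[\nu_1, \nu_2; 1 - \alpha]$. The critical value itself must be supplied by the user from tables, and the final tail probability of the central $F$ distribution is not evaluated here because the Python standard library provides no $F$ distribution.

```python
import math


def lambda_interaction(n, var_ab, var_e):
    """Scale factor for the AB interaction test."""
    return math.sqrt(1.0 + n * var_ab / var_e)


def lambda_main_random(other_levels, n, var_main, var_ab, var_e):
    """Scale factor for a main-effect test in the random effects model.

    other_levels is b for factor A and a for factor B.
    """
    return math.sqrt(1.0 + other_levels * n * var_main / (var_e + n * var_ab))


def scaled_critical_value(f_crit, lam):
    """Threshold lam**(-2) * F[v1, v2; 1 - alpha] used in the power expression."""
    return f_crit / lam ** 2


# Hypothetical design and variance components, for illustration only.
a, b, n = 3, 4, 2
lam_ab = lambda_interaction(n, var_ab=1.0, var_e=2.0)
lam_a = lambda_main_random(b, n, var_main=1.5, var_ab=1.0, var_e=2.0)
print(lam_ab, (a - 1) * (b - 1), a * b * (n - 1))   # lambda, v1, v2 for AB
print(lam_a, a - 1, (a - 1) * (b - 1))              # lambda, v1, v2 for A
```

The power is then the probability that a central $F$ variable with the printed degrees of freedom exceeds `scaled_critical_value(f_crit, lam)`.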


4.12 MULTIPLE COMPARISON METHODS 


Usually, more than one comparison is of interest and the multiple comparison
procedures discussed in Section 2.19 can be employed with only minor modifications.
The procedures can be utilized for the fixed as well as the mixed effects
models. Most comparisons concern the control of the error rate $\alpha$ for each separate
family of $F$ tests, that is, the two main effects and the cell means. One
may want to control $\alpha$ for the entire experiment comprising all three families
of tests but it is rarely of interest.

For example, under Model I, if $H_0^{AB}$ is rejected, we would be interested in
comparing the cell means $\mu_{ij} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij}$. Then, the Tukey's or
Scheffé's method may be used to investigate the contrasts of the type

$$ L = \mu_{ij} - \mu_{i'j'}, $$

among all cell means, where $L$ is estimated by

$$ \hat{L} = \bar{y}_{ij.} - \bar{y}_{i'j'.}. $$

Now, the procedure is equivalent to the one-way classification model with the
total number of treatments here being equal to $r = ab$ and the degrees of freedom
for $MS_E$ equal to $ab(n-1)$.
Thus, suppose that $\bar{y}_{ij.}$ is larger than $\bar{y}_{i'j'.}$. Then using the Tukey's procedure
$L$ is significantly different from zero with confidence coefficient $1 - \alpha$ if

$$ \frac{\hat{L}}{\sqrt{MS_E/n}} > q[ab, ab(n-1); 1 - \alpha]. $$

If the Scheffé's method is applied to these comparisons, then $L$ is significantly
different from zero with confidence coefficient $1 - \alpha$ if

$$ \frac{\hat{L}}{\sqrt{(ab-1)MS_E(2/n)}} > \{F[ab-1, ab(n-1); 1 - \alpha]\}^{1/2}. $$

If $H_0^{AB}$ is not rejected, we usually would proceed to test $H_0^{A}$ and $H_0^{B}$. If $H_0^{A}$ or
$H_0^{B}$ is rejected, the Tukey's or Scheffé's method may be used to study contrasts
among the $\alpha_i$'s or $\beta_j$'s of the form

$$ L = \sum_{i=1}^{a} \ell_i\alpha_i \qquad \left(\sum_{i=1}^{a} \ell_i = 0\right) $$

or

$$ L' = \sum_{j=1}^{b} \ell'_j\beta_j \qquad \left(\sum_{j=1}^{b} \ell'_j = 0\right), $$

where $L$ and $L'$ are estimated unbiasedly by

$$ \hat{L} = \sum_{i=1}^{a} \ell_i\,\bar{y}_{i..} $$

and

$$ \hat{L}' = \sum_{j=1}^{b} \ell'_j\,\bar{y}_{.j.}, $$

respectively. If the Tukey's method is used, $L$ is significant at the $\alpha$-level if

$$
\frac{\hat{L}}{\sqrt{(bn)^{-1}MS_E}\left(\frac{1}{2}\sum_{i=1}^{a}|\ell_i|\right)} > q[a, ab(n-1); 1 - \alpha].
$$

Similarly, if the Scheffé's method is used, $L$ is significant at the $\alpha$-level if

$$
\frac{\hat{L}}{\sqrt{(a-1)(bn)^{-1}MS_E\left(\sum_{i=1}^{a}\ell_i^2\right)}} > \{F[a-1, ab(n-1); 1 - \alpha]\}^{1/2}.
$$

Likewise, the significance of the contrast $L'$ can be tested.
If one wishes to construct intervals for $L$, then using the Tukey's method a
$100(1 - \alpha)$ percent simultaneous confidence interval for $L$ is given by

$$
\hat{L} - T\sqrt{(bn)^{-1}MS_E}\left(\frac{1}{2}\sum_{i=1}^{a}|\ell_i|\right)
\le L \le
\hat{L} + T\sqrt{(bn)^{-1}MS_E}\left(\frac{1}{2}\sum_{i=1}^{a}|\ell_i|\right),
\qquad (4.12.1)
$$

where

$$ T = q[a, ab(n-1); 1 - \alpha]. $$

In particular, for a pairwise contrast, the Tukey's interval is

$$
\bar{y}_{i..} - \bar{y}_{i'..} - T\sqrt{(bn)^{-1}MS_E}
\le \alpha_i - \alpha_{i'} \le
\bar{y}_{i..} - \bar{y}_{i'..} + T\sqrt{(bn)^{-1}MS_E}.
$$

Using the Scheffé's method, the interval will be

$$
\hat{L} - S\sqrt{(a-1)(bn)^{-1}MS_E\left(\sum_{i=1}^{a}\ell_i^2\right)}
\le L \le
\hat{L} + S\sqrt{(a-1)(bn)^{-1}MS_E\left(\sum_{i=1}^{a}\ell_i^2\right)},
\qquad (4.12.2)
$$

where

$$ S^2 = F[a-1, ab(n-1); 1 - \alpha]. $$

The Bonferroni-type confidence interval based on the $t$ distribution is obtained
as

$$
\hat{L} - t[ab(n-1), 1 - \alpha/2m]\sqrt{(bn)^{-1}MS_E\sum_{i=1}^{a}\ell_i^2}
\le L \le
\hat{L} + t[ab(n-1), 1 - \alpha/2m]\sqrt{(bn)^{-1}MS_E\sum_{i=1}^{a}\ell_i^2},
$$

where $m$ is the number of intervals made, with an overall level of at least $1 - \alpha$.
Similar confidence intervals can be given for $L'$.

When a design is slightly unbalanced and one uses the unweighted-means
analysis, then the foregoing Tukey's procedure can be used by replacing $n$ by
$n_h$ given by (4.10.5). The coverage probability should be approximately $1 - \alpha$;
however, as the design becomes more imbalanced, the coverage probability
deteriorates.

Under Model III, the contrasts of interest involve only $\alpha_i$'s and if $H_0^{A}$ is
rejected, the Tukey's or Scheffé's method can be employed to investigate contrasts
of the type $\sum_{i=1}^{a} \ell_i\alpha_i$. For example, suppose we wish to obtain all pairwise
comparisons between $\alpha_i$'s by means of Tukey's method. Then

$$ L = \alpha_i - \alpha_{i'}, $$

$$ \hat{L} = \bar{y}_{i..} - \bar{y}_{i'..}, $$

and

$$ \widehat{\operatorname{Var}}(\hat{L}) = 2MS_{AB}/bn. $$

The value of $T$ in this case will be

$$ T = q[a, (a-1)(b-1); 1 - \alpha], $$

leading to the interval

$$
\bar{y}_{i..} - \bar{y}_{i'..} - T\sqrt{(bn)^{-1}MS_{AB}}
\le \alpha_i - \alpha_{i'} \le
\bar{y}_{i..} - \bar{y}_{i'..} + T\sqrt{(bn)^{-1}MS_{AB}}.
\qquad (4.12.3)
$$

For the Scheffé's interval, we will have

$$
\hat{L} - S\sqrt{(a-1)(bn)^{-1}MS_{AB}\left(\sum_{i=1}^{a}\ell_i^2\right)}
\le L \le
\hat{L} + S\sqrt{(a-1)(bn)^{-1}MS_{AB}\left(\sum_{i=1}^{a}\ell_i^2\right)},
\qquad (4.12.4)
$$

where

$$ S^2 = F[a-1, (a-1)(b-1); 1 - \alpha]. $$

For just a single confidence interval, one can use $\sqrt{2}\,t[(a-1)(b-1); 1 - \alpha/2]$
in place of $T$. For a limited number of comparisons $k$, the Bonferroni intervals
are obtained by using $\sqrt{2}\,t[(a-1)(b-1); 1 - \alpha/2k]$ instead of $T$. For $k <
a(a-1)/2$, the Bonferroni intervals are usually shorter than the Tukey intervals.
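
The half-widths of the Tukey, Scheffé, and Bonferroni intervals of this section differ only in the critical value and in how the contrast coefficients enter. The sketch below is an illustrative addition with hypothetical function names. Critical values are passed in as arguments because they come from tables; the value 4.26 for $F[2, 9; 0.95]$ is the one quoted in Section 4.13, whereas the studentized range value 3.95 for $q[3, 9; 0.95]$ is an assumption taken from standard tables and should be verified. For Model III the same functions apply with $MS_{AB}$ in place of $MS_E$.

```python
import math


def tukey_halfwidth(q_crit, mean_square, b, n, coeffs):
    """T * sqrt(MS / (bn)) * (1/2) * sum |l_i|."""
    return q_crit * math.sqrt(mean_square / (b * n)) * 0.5 * sum(abs(c) for c in coeffs)


def scheffe_halfwidth(f_crit, mean_square, a, b, n, coeffs):
    """S * sqrt((a - 1) * MS / (bn) * sum l_i^2), with S^2 = f_crit."""
    return math.sqrt(f_crit) * math.sqrt(
        (a - 1) * mean_square / (b * n) * sum(c * c for c in coeffs)
    )


def bonferroni_halfwidth(t_crit, mean_square, b, n, coeffs):
    """t * sqrt(MS / (bn) * sum l_i^2)."""
    return t_crit * math.sqrt(mean_square / (b * n) * sum(c * c for c in coeffs))


# Age-group contrast (group 3 minus group 1) for the running-time data of Section 4.13.
a, b, n = 3, 3, 2
ms_e = 2888 / 9
row_means = [3619 / 6, 4111 / 6, 5122 / 6]
coeffs = [-1, 0, 1]
estimate = sum(c * m for c, m in zip(coeffs, row_means))

scheffe = scheffe_halfwidth(4.26, ms_e, a, b, n, coeffs)   # F[2, 9; 0.95] from the text
tukey = tukey_halfwidth(3.95, ms_e, b, n, coeffs)          # q value is an assumed table entry
print(estimate, estimate - scheffe, estimate + scheffe)
print(estimate, estimate - tukey, estimate + tukey)
```

For a pairwise contrast the factor $\frac{1}{2}\sum|\ell_i|$ equals one, so `tukey_halfwidth` reduces to the pairwise form given above.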


4.13 WORKED EXAMPLE FOR MODEL I


Steel and Torrie (1980, pp. 217-218) reported data (courtesy of A.C. Linnerud,
North Carolina State University) on times (in seconds) to complete a 1.5-mile
course. All the runners were men classified in three age groups and in three
fitness categories. The data form a two-way classification and are shown in
Table 4.9. The example just described can be regarded as a two-way fixed
effects model since three age groups and three fitness categories are specially
chosen by the researcher to be of particular interest and thus both factors will
have systematic effects. Since there are two observations for each combination
level, this will enable the experimenter to evaluate for the presence of interaction
effects. If there were just one observation for each combination level, either lack
of interaction would have to be assumed or its presence would be confounded
with the error term. It could not be estimated separately.

TABLE 4.9
Running Time (in seconds) to Complete a 1.5 Mile Course

                     Fitness Category
Age Group      Low        Medium     High

40             669        602        527
               671        603        547

50             775        684        571
               821        687        573

60             1,009      824        688
               1,060      828        713

Source: Steel and Torrie (1980, p. 218). Used with permission.

The mathematical model for this experiment would be

$$
y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + e_{ijk}, \qquad
\begin{cases}
i = 1, 2, 3 \\
j = 1, 2, 3 \\
k = 1, 2,
\end{cases}
$$

where $\mu$ is the general mean, $\alpha_i$ is the effect of the $i$-th age group ($\sum_{i=1}^{3}\alpha_i = 0$),
$\beta_j$ is the effect of the $j$-th fitness category ($\sum_{j=1}^{3}\beta_j = 0$), $(\alpha\beta)_{ij}$ is the
fixed effect interaction of the $i$-th age group with the $j$-th fitness category
($\sum_{i=1}^{3}(\alpha\beta)_{ij} = 0 = \sum_{j=1}^{3}(\alpha\beta)_{ij}$), and $e_{ijk}$'s are experimental errors assumed to
be independently and normally distributed each with mean zero and variance $\sigma_e^2$.
The following computations will lead to the analysis of variance table.


(i) The cell totals:

$y_{11.} = 1,340$, $y_{12.} = 1,205$, $y_{13.} = 1,074$;
$y_{21.} = 1,596$, $y_{22.} = 1,371$, $y_{23.} = 1,144$;
$y_{31.} = 2,069$, $y_{32.} = 1,652$, $y_{33.} = 1,401$.

(ii) The row (age) totals:

$y_{1..} = 3,619$, $y_{2..} = 4,111$, $y_{3..} = 5,122$.

(iii) The column (fitness) totals:

$y_{.1.} = 5,005$, $y_{.2.} = 4,228$, $y_{.3.} = 3,619$.

(iv) The grand total:

$y_{...} = 3,619 + 4,111 + 5,122 = 12,852$.

(v) $\sum_{i=1}^{3}\sum_{j=1}^{3}\sum_{k=1}^{2} y_{ijk}^2 = (669)^2 + (671)^2 + \cdots + (713)^2 = 9,557,568$.

(vi) $\dfrac{y_{...}^2}{3 \times 3 \times 2} = \dfrac{(12,852)^2}{18} = 9,176,328$.

(vii) $\dfrac{1}{3 \times 2}\sum_{i=1}^{3} y_{i..}^2 = \dfrac{(3,619)^2 + (4,111)^2 + (5,122)^2}{6} = 9,372,061$.

(viii) $\dfrac{1}{3 \times 2}\sum_{j=1}^{3} y_{.j.}^2 = \dfrac{(5,005)^2 + (4,228)^2 + (3,619)^2}{6} = 9,337,195$.

(ix) $\dfrac{1}{2}\sum_{i=1}^{3}\sum_{j=1}^{3} y_{ij.}^2 = \dfrac{(1,340)^2 + (1,205)^2 + \cdots + (1,401)^2}{2} = 9,554,680$.

(x) $SS_T = 9,557,568 - 9,176,328 = 381,240$.
(xi) $SS_{TC} = 9,554,680 - 9,176,328 = 378,352$.
(xii) $SS_A = 9,372,061 - 9,176,328 = 195,733$.
(xiii) $SS_B = 9,337,195 - 9,176,328 = 160,867$.
(xiv) $SS_{AB} = 378,352 - 195,733 - 160,867 = 21,752$.
(xv) $SS_E = 381,240 - 378,352 = 2,888$.

These results, along with the remaining analysis of variance computations,
are summarized in Table 4.10. If we choose the level of significance $\alpha = 0.05$,
we find from Appendix Table V that

$$ F[2, 9; 0.95] = 4.26, $$

and

$$ F[4, 9; 0.95] = 3.63. $$


Comparing these values with the computed F values given in Table 4.10, we 
may reach the following conclusions: 


(a) Reject the hypothesis of no interaction effects and conclude that there
is strong evidence of interaction between the different age groups and
the different fitness categories ($p < 0.001$).

(b) Reject the hypothesis of no age effects and conclude that different
age groups result in different mean running time to complete the race
($p < 0.001$).

(c) Reject the hypothesis of no fitness effects and conclude that the mean
running times to complete the race are not the same for the three categories
($p < 0.001$).

TABLE 4.10
Analysis of Variance for the Running Time Data of Table 4.9

Source of           Degrees of   Sum of      Mean          Expected                                                                            F Value    p-Value
Variation           Freedom      Squares     Square        Mean Square

Age group           2            195,733     97,866.500    $\sigma_e^2 + \frac{3 \times 2}{3-1}\sum_{i=1}^{3}\alpha_i^2$                        304.99     <0.001
Fitness category    2            160,867     80,433.500    $\sigma_e^2 + \frac{3 \times 2}{3-1}\sum_{j=1}^{3}\beta_j^2$                         250.66     <0.001
Interaction         4            21,752      5,438.000     $\sigma_e^2 + \frac{2}{(3-1)(3-1)}\sum_{i=1}^{3}\sum_{j=1}^{3}(\alpha\beta)_{ij}^2$   16.95      <0.001
Error               9            2,888       320.889       $\sigma_e^2$
Total               17           381,240     22,425.882


It should be noted that the presence of interaction between age group and
fitness category seems to be more than just a chance occurrence. The presence of
interactions makes the interpretation of the main effects more difficult. Although
$F$ tests still remain valid, the hypotheses about the main effects cannot be
interpreted only in terms of the $\alpha_i$'s and the $\beta_j$'s. Nevertheless, assuming that the
interaction effects are unimportant, we attempt to illustrate the use of orthogonal
contrasts to partition the sums of squares for age group and fitness category and
perform tests on a contrast.

If the hypothesis of no interaction were true, we could make general comparisons
regarding the fitness rather than separate comparisons for each age group.
Similarly, we might make general comparisons among the age groups rather
than separate comparisons for each fitness. For example, we could compare
fitness category low versus high and also low and high versus medium. The
contrasts for making these comparisons would be

$$ L_1 = \beta_1 - \beta_3 $$

and

$$ L_2 = \beta_1 + \beta_3 - 2\beta_2, $$

respectively. The single degree of freedom sums of squares associated with $L_1$
and $L_2$ are obtained as follows:

$$ SS_{L_1} = \frac{(5,005 - 3,619)^2}{6[(1)^2 + (-1)^2]} = 160,083 $$

and

$$ SS_{L_2} = \frac{(5,005 + 3,619 - 2 \times 4,228)^2}{6[(1)^2 + (1)^2 + (-2)^2]} = 784. $$

Notice that $SS_{L_1} + SS_{L_2} = SS_B$ since $L_1$ and $L_2$ are two independent orthogonal
contrasts which partition the sum of squares for the fitness into two single
degree of freedom sums of squares. The computed $F$ values corresponding to
$L_1$ and $L_2$ are, respectively,

$$ F_{L_1} = \frac{160,083}{320.889} = 498.87 $$

and

$$ F_{L_2} = \frac{784.0}{320.889} = 2.44. $$

Comparing these values to the critical value $F[1, 9; 0.95] = 5.12$, we find that
$F_{L_1}$ is highly significant ($p < 0.001$) but $F_{L_2}$ falls well below the critical
value ($p = 0.15$). Thus, the results of the $F$ tests indicate that the hypothesis
$H_0 : \beta_1 - \beta_3 = 0$ is rejected whereas the hypothesis $H_0 : \beta_1 + \beta_3 - 2\beta_2 = 0$ is
sustained.
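
The single degree of freedom sums of squares for the two fitness contrasts can be checked with a few lines of code. This sketch is an illustrative addition with a hypothetical helper name, `contrast_ss`; it uses the fitness totals and the error mean square from the computations above, with each total based on $an = 6$ observations.

```python
def contrast_ss(totals, coeffs, obs_per_total):
    """Sum of squares for a contrast computed from treatment totals."""
    value = sum(c * t for c, t in zip(coeffs, totals))
    return value ** 2 / (obs_per_total * sum(c * c for c in coeffs))


fitness_totals = [5005, 4228, 3619]   # low, medium, high
ms_e = 2888 / 9                       # error mean square, 9 degrees of freedom

ss_l1 = contrast_ss(fitness_totals, [1, 0, -1], 6)   # low versus high
ss_l2 = contrast_ss(fitness_totals, [1, -2, 1], 6)   # low and high versus medium

print(ss_l1, ss_l2, ss_l1 + ss_l2)
print(round(ss_l1 / ms_e, 2), round(ss_l2 / ms_e, 2))
```

Because the two coefficient vectors are orthogonal, the two contrast sums of squares add up to the fitness sum of squares.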

Similarly, we could compare age groups 1 and 3 and also age groups 1 
and 3 versus 2. The resulting F ratios, each with 1 and 9 degrees of freedom 
are both highly significant (p < 0.001). The results indicate that the low age 
group requires the least running time whereas the upper age group requires the 
most running time. However, as indicated previously, the researcher should be 
cautious in making any general conclusions because of the strong interaction 
effects between the factors. The running time within each age group varies 
greatly according to the fitness category. Although, in each age group, the 
running time decreases dramatically as we move from the low to the high 
fitness category, the decrease is much greater for the upper age group than for 
the low and the middle age group. 


4.14 WORKED EXAMPLE FOR MODEL I: UNEQUAL SAMPLE 
SIZES PER CELL 


The following example is based on an unbalanced design described in Blackwell
et al. (1991, pp. 286-287). The original data came from a balanced factorial
experiment reported by Snedecor and Cochran (1989, p. 304) to test the
effectiveness of two factors: source of protein (three levels: beef, pork, and
cereal) and quantity of protein (two levels: high and low), forming six
potential protein feeding treatments. Ten male rats were randomly assigned to
each treatment and gains in weight were recorded. The experimental data were
made unbalanced by deleting eight observations and the remaining observations
are given in Table 4.11.


TABLE 4.11
Weight Gains (in grams) of Rats under Different Diets (Data Made
Unbalanced by Deleting Observations)

Source of   Quantity of Protein
Protein     High                                       Low
Beef        81, 100, 102, 104, 107, 111, 117, 118      51, 64, 72, 76, 78, 86, 95
Pork        79, 91, 94, 96, 98, 102, 102, 108          49, 70, 73, 81, 82, 82, 86, 97, 106
Cereal      56, 74, 77, 82, 86, 88, 92, 95, 98, 111    58, 67, 74, 74, 80, 89, 95, 97, 98, 107

Source: Blackwell et al. (1991, p. 287). Used with permission.

In the following, we illustrate both unweighted and weighted squares of
means analyses. In this type of analysis, two models are used. The model for
the mean of each subclass is

$$\bar{x}_{ij} = \bar{y}_{ij.} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \bar{e}_{ij.}, \qquad i = 1, 2, 3; \; j = 1, 2,$$

where $\mu$ is the general mean, $\alpha_i$ is the effect of the i-th source of
protein ($\sum_{i=1}^{3} \alpha_i = 0$), $\beta_j$ is the effect of the j-th
quantity of protein ($\sum_{j=1}^{2} \beta_j = 0$), $(\alpha\beta)_{ij}$ is the
fixed effect interaction of the i-th protein source with the j-th protein
quantity ($\sum_{i=1}^{3} (\alpha\beta)_{ij} = 0 = \sum_{j=1}^{2} (\alpha\beta)_{ij}$),
and $\bar{e}_{ij.} = \sum_{k=1}^{n_{ij}} e_{ijk}/n_{ij}$ is the experimental error
associated with the (i, j)-th cell, which is assumed to be independently and
normally distributed with mean zero and variance $\sigma_e^2/n_{ij}$.
The preceding equation represents the model being assumed when the sums of
squares for factors A and B and the interaction AB are computed. For computing
the error sum of squares, the model being used is

$$y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + e_{ijk},$$

where, again, $\sum_{i=1}^{3} \alpha_i = \sum_{j=1}^{2} \beta_j = \sum_{i=1}^{3} (\alpha\beta)_{ij} = \sum_{j=1}^{2} (\alpha\beta)_{ij} = 0$ and $e_{ijk}$
is the experimental error assumed to be independently and normally distributed
with mean zero and variance $\sigma_e^2$. The expectation of error mean square is
$\sigma_e^2/n_h$, where $n_h$ is the harmonic mean of the $n_{ij}$'s, which has to
be divided by $1/n_h$ so that it yields an unbiased estimate of $\sigma_e^2$.


The computations proceed as follows:

The computations proceed as follows: 


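(i) The cell counts ($n_{ij}$):

$$n_{11} = 8, \quad n_{12} = 7, \quad n_{21} = 8, \quad n_{22} = 9, \quad n_{31} = 10, \quad n_{32} = 10.$$

(ii) The cell means ($\bar{x}_{ij}$):

$$\bar{x}_{11} = 105.0000, \quad \bar{x}_{12} = 74.5714,$$
$$\bar{x}_{21} = 96.2500, \quad \bar{x}_{22} = 80.6667,$$
$$\bar{x}_{31} = 85.9000, \quad \bar{x}_{32} = 83.9000.$$

(iii) The row (protein source) means:

$$\bar{x}_{1.} = 89.7857, \quad \bar{x}_{2.} = 88.4584, \quad \bar{x}_{3.} = 84.9000.$$

(iv) The column (protein quantity) means:

$$\bar{x}_{.1} = 95.7167, \quad \bar{x}_{.2} = 79.7127.$$

(v) The grand mean:

$$\bar{x}_{..} = 87.7147.$$

(vi)

$$\sum_{i=1}^{3} \sum_{j=1}^{2} \bar{x}_{ij}^2 = (105.0000)^2 + (74.5714)^2 + \cdots + (83.9000)^2 = 46{,}775.0927.$$

(vii)

$$3 \times 2\,\bar{x}_{..}^2 = 6(87.7147)^2 = 46{,}163.2116.$$

(viii)

$$2 \sum_{i=1}^{3} \bar{x}_{i.}^2 = 2[(89.7857)^2 + (88.4584)^2 + (84.9000)^2] = 46{,}188.7409.$$

(ix)

$$3 \sum_{j=1}^{2} \bar{x}_{.j}^2 = 3[(95.7167)^2 + (79.7127)^2] = 46{,}547.4036.$$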


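Now, the unweighted sums of squares for factor A (protein source) and factor
B (protein quantity) are:

$$SS_{Au} = 2 \sum_{i=1}^{3} (\bar{x}_{i.} - \bar{x}_{..})^2 = 2 \sum_{i=1}^{3} \bar{x}_{i.}^2 - 3 \times 2\,\bar{x}_{..}^2 = 46{,}188.7409 - 46{,}163.2116 = 25.5293$$

and

$$SS_{Bu} = 3 \sum_{j=1}^{2} (\bar{x}_{.j} - \bar{x}_{..})^2 = 3 \sum_{j=1}^{2} \bar{x}_{.j}^2 - 3 \times 2\,\bar{x}_{..}^2 = 46{,}547.4036 - 46{,}163.2116 = 384.1920.$$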


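To calculate the corresponding weighted sums of squares, we have

$$w_{A1} = \frac{b^2}{\frac{1}{n_{11}} + \frac{1}{n_{12}}} = \frac{(2)^2}{\frac{1}{8} + \frac{1}{7}} = 14.9333,$$

$$w_{A2} = \frac{b^2}{\frac{1}{n_{21}} + \frac{1}{n_{22}}} = \frac{(2)^2}{\frac{1}{8} + \frac{1}{9}} = 16.9412,$$

$$w_{A3} = \frac{b^2}{\frac{1}{n_{31}} + \frac{1}{n_{32}}} = \frac{(2)^2}{\frac{1}{10} + \frac{1}{10}} = 20.0000,$$

$$w_{B1} = \frac{a^2}{\frac{1}{n_{11}} + \frac{1}{n_{21}} + \frac{1}{n_{31}}} = \frac{(3)^2}{\frac{1}{8} + \frac{1}{8} + \frac{1}{10}} = 25.7143,$$

$$w_{B2} = \frac{a^2}{\frac{1}{n_{12}} + \frac{1}{n_{22}} + \frac{1}{n_{32}}} = \frac{(3)^2}{\frac{1}{7} + \frac{1}{9} + \frac{1}{10}} = 25.4260,$$

$$\bar{x}_A = \frac{\sum_{i=1}^{3} w_{Ai}\,\bar{x}_{i.}}{\sum_{i=1}^{3} w_{Ai}} = \frac{(14.9333)(89.7857) + (16.9412)(88.4584) + (20.0000)(84.9000)}{14.9333 + 16.9412 + 20.0000} = 87.4686,$$

and

$$\bar{x}_B = \frac{\sum_{j=1}^{2} w_{Bj}\,\bar{x}_{.j}}{\sum_{j=1}^{2} w_{Bj}} = \frac{(25.7143)(95.7167) + (25.4260)(79.7127)}{25.7143 + 25.4260} = 87.7598.$$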


Therefore, the corresponding weighted sums of squares are:

$$SS_{Aw} = \sum_{i=1}^{3} w_{Ai} (\bar{x}_{i.} - \bar{x}_A)^2 = 14.9333(89.7857 - 87.4686)^2 + 16.9412(88.4584 - 87.4686)^2 + 20.0000(84.9000 - 87.4686)^2 = 228.7277$$

and

$$SS_{Bw} = \sum_{j=1}^{2} w_{Bj} (\bar{x}_{.j} - \bar{x}_B)^2 = 25.7143(95.7167 - 87.7598)^2 + 25.4260(79.7127 - 87.7598)^2 = 3{,}274.5118.$$


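The interaction and error sums of squares for both unweighted and weighted
analyses are:

$$SS_{AB} = \sum_{i=1}^{3} \sum_{j=1}^{2} (\bar{x}_{ij} - \bar{x}_{i.} - \bar{x}_{.j} + \bar{x}_{..})^2 = \sum_{i=1}^{3} \sum_{j=1}^{2} \bar{x}_{ij}^2 - 2 \sum_{i=1}^{3} \bar{x}_{i.}^2 - 3 \sum_{j=1}^{2} \bar{x}_{.j}^2 + 3 \times 2\,\bar{x}_{..}^2$$
$$= 46{,}775.0927 - 46{,}188.7409 - 46{,}547.4036 + 46{,}163.2116 = 202.1598$$

and

$$SS_E = \sum_{i=1}^{3} \sum_{j=1}^{2} \sum_{k=1}^{n_{ij}} (y_{ijk} - \bar{y}_{ij.})^2 = \sum_{i=1}^{3} \sum_{j=1}^{2} \sum_{k=1}^{n_{ij}} y_{ijk}^2 - \sum_{i=1}^{3} \sum_{j=1}^{2} n_{ij}\,\bar{y}_{ij.}^2 = 413{,}088 - 403{,}983.0043 = 9{,}104.9957.$$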


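Finally, the total sum of squares is obtained as

$$SS_T = \sum_{i=1}^{3} \sum_{j=1}^{2} \sum_{k=1}^{n_{ij}} (y_{ijk} - \bar{y}_{...})^2 = \sum_{i=1}^{3} \sum_{j=1}^{2} \sum_{k=1}^{n_{ij}} y_{ijk}^2 - \frac{y_{...}^2}{N} = 413{,}088 - \frac{(4{,}556)^2}{52} = 13{,}912.3077.$$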
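The unweighted and weighted squares of means computations can also be
reproduced directly from the raw data of Table 4.11. The sketch below is
illustrative and uses only the Python standard library; all variable names are
assumptions made for this example. Because it carries full precision instead of
the four-decimal means used in the hand computation, its results agree with the
values in the text only up to small rounding differences.

```python
from statistics import mean

# Weight gains (grams) from Table 4.11, keyed by (protein source, protein quantity).
cells = {
    ("beef", "high"): [81, 100, 102, 104, 107, 111, 117, 118],
    ("beef", "low"): [51, 64, 72, 76, 78, 86, 95],
    ("pork", "high"): [79, 91, 94, 96, 98, 102, 102, 108],
    ("pork", "low"): [49, 70, 73, 81, 82, 82, 86, 97, 106],
    ("cereal", "high"): [56, 74, 77, 82, 86, 88, 92, 95, 98, 111],
    ("cereal", "low"): [58, 67, 74, 74, 80, 89, 95, 97, 98, 107],
}
sources = ["beef", "pork", "cereal"]
quantities = ["high", "low"]
a, b = len(sources), len(quantities)

# Cell means and the unweighted row, column, and grand means of those cell means.
cell_mean = {key: mean(obs) for key, obs in cells.items()}
row_mean = {s: mean(cell_mean[s, q] for q in quantities) for s in sources}
col_mean = {q: mean(cell_mean[s, q] for s in sources) for q in quantities}
grand = mean(cell_mean.values())

# Unweighted squares of means.
ss_a_unw = b * sum((row_mean[s] - grand) ** 2 for s in sources)
ss_b_unw = a * sum((col_mean[q] - grand) ** 2 for q in quantities)
ss_ab = sum(
    (cell_mean[s, q] - row_mean[s] - col_mean[q] + grand) ** 2
    for s in sources
    for q in quantities
)

# Weighted squares of means: weights built from reciprocals of the cell counts.
w_a = {s: b ** 2 / sum(1 / len(cells[s, q]) for q in quantities) for s in sources}
w_b = {q: a ** 2 / sum(1 / len(cells[s, q]) for s in sources) for q in quantities}
xbar_a = sum(w_a[s] * row_mean[s] for s in sources) / sum(w_a.values())
xbar_b = sum(w_b[q] * col_mean[q] for q in quantities) / sum(w_b.values())
ss_a_w = sum(w_a[s] * (row_mean[s] - xbar_a) ** 2 for s in sources)
ss_b_w = sum(w_b[q] * (col_mean[q] - xbar_b) ** 2 for q in quantities)

# Error and total sums of squares from the individual observations.
ss_e = sum((y - cell_mean[key]) ** 2 for key, obs in cells.items() for y in obs)
all_obs = [y for obs in cells.values() for y in obs]
ss_t = sum((y - mean(all_obs)) ** 2 for y in all_obs)

for label, value in [
    ("SS_A unweighted", ss_a_unw), ("SS_B unweighted", ss_b_unw),
    ("SS_A weighted", ss_a_w), ("SS_B weighted", ss_b_w),
    ("SS_AB", ss_ab), ("SS_E", ss_e), ("SS_T", ss_t),
]:
    print(f"{label:16s} {value:12.4f}")
```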


These results along with the remaining analysis of variance computations
are summarized in Table 4.12. Note that for both weighted and unweighted
analyses, the sums of squares do not add up to the total sum of squares. If we
choose the level of significance $\alpha = 0.05$, we find from Appendix Table V that

F[2, 46; 0.95] = 3.20

and

F[1, 46; 0.95] = 4.05.

Comparing these values with the computed F values given in Table 4.12 for both
unweighted and weighted analyses, we may reach the following conclusions:

(a) Reject the hypothesis of no interaction effects and conclude that there
is some evidence of interaction between protein source and protein
quantity.

(b) Do not reject the hypothesis of no protein source effects and conclude
that there is a lack of evidence that mean weights (population marginal
means) for the three sources of protein differ.


TABLE 4.12
Analysis of Variance Using Unweighted and Weighted Squares of Means Analysis
of the Unbalanced Data on Weight Gains (in grams) of Table 4.11

Source of         Degrees of   Sum of Squares             Mean Square                F Value                p-Value
Variation         Freedom      Unweighted    Weighted     Unweighted    Weighted     Unweighted  Weighted   Unweighted  Weighted
Protein source     2*              25.5293     228.7277      12.7646     114.3629     0.56**      0.58       0.581       0.564
Protein quantity   1*             384.1920   3,274.5118     384.1920   3,274.5118    16.54**     16.54      <0.001      <0.001
Interaction        2*             202.1598     202.1598     101.0799     101.0799     4.35**      4.35       0.019       0.019
Error             46            9,104.9957   9,104.9957     197.9347     197.9347
Total             51           13,912.3077  13,912.3077

* The amended degrees of freedom using formula (4.10.6) for the unweighted
analysis are nearly the same.
** In the computation of F values, the error mean square is modified by
dividing it by the harmonic mean of the $n_{ij}$'s given by
$n_h = 6/(1/8 + 1/7 + 1/8 + 1/9 + 1/10 + 1/10) = 8.5231$.


(c) Reject the hypothesis of no protein quantity effects and conclude that 
there is strong evidence that mean weights for two quantities (marginal 
means for high and low levels) of protein differ significantly. 


It should be noted that the presence of interaction between protein source and 
quantity seems to be more than just a chance occurrence. The presence of 
interactions makes the statements about main effects somewhat difficult to 
interpret. Although the F tests are still valid, the hypotheses about the main 
effects cannot be interpreted only in terms of the $\alpha_i$'s and the $\beta_j$'s.
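The F ratios of the unweighted analysis depend on the harmonic mean of the cell
counts, since the error mean square is divided by it before forming each ratio.
The short sketch below illustrates that step under stated assumptions: the sums
of squares, degrees of freedom, and critical values are those quoted in the
text, and the dictionary and variable names are hypothetical.

```python
from statistics import harmonic_mean

cell_counts = [8, 7, 8, 9, 10, 10]   # n_ij from the worked example
n_h = harmonic_mean(cell_counts)     # harmonic mean of the cell counts

ms_error = 9104.9957 / 46            # error mean square from the text

# (sum of squares, degrees of freedom, critical value at the 0.05 level)
unweighted = {
    "protein source": (25.5293, 2, 3.20),
    "protein quantity": (384.1920, 1, 4.05),
    "interaction": (202.1598, 2, 3.20),
}

f_unweighted = {}
for name, (ss, df, crit) in unweighted.items():
    # Mean squares built from cell means are compared with MS_E / n_h.
    f_value = (ss / df) / (ms_error / n_h)
    f_unweighted[name] = f_value
    verdict = "reject" if f_value > crit else "do not reject"
    print(f"{name:17s} F = {f_value:6.2f}  critical = {crit:4.2f}  -> {verdict}")
```

With these inputs, only the protein source ratio falls below its critical
value, in line with conclusions (a) through (c).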


4.15 WORKED EXAMPLE FOR MODEL II


Burdick and Graybill (1992, pp. 11-12) described a quality control experiment 
designed to study the sources of variability in the length of window screens. It 
is desired to determine the contribution of the variability in the final product 
that is due to operators, machines, and the operator x machine interaction. 
Three operators and four machines are randomly selected from the available 
operators and machines in the company and each operator makes two screens 
on each of the selected machines. The data collected in the experiment are 
given in Table 4.13. This is an example of a two-way crossed classification 
with replication. Here, our two factors are operators and machines and the 
experimental units that provide the replication are the machine-operator duos. 
Inasmuch as both factors are randomly selected from a rather large population, 
the data should be analyzed using Model II. Furthermore, since there are two 
observations for each combination of operator and machine, this will enable 
the experimenter to test for the presence of any interaction. 
The mathematical model for this experiment would be 


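$$y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + e_{ijk}, \qquad i = 1, 2, 3; \; j = 1, 2, 3, 4; \; k = 1, 2,$$

where $\mu$ is the general mean, $\alpha_i$ is the effect of the i-th operator,
$\beta_j$ is the effect of the j-th machine, $(\alpha\beta)_{ij}$ is the interaction
of the i-th operator with the j-th machine, and $e_{ijk}$'s are experimental
errors. It is further assumed that $\alpha_i \sim N(0, \sigma_\alpha^2)$,
$\beta_j \sim N(0, \sigma_\beta^2)$, $(\alpha\beta)_{ij} \sim N(0, \sigma_{\alpha\beta}^2)$;
and that the $\alpha_i$'s, $\beta_j$'s, $(\alpha\beta)_{ij}$'s, and $e_{ijk}$'s are
mutually and completely independent.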

The following computations lead to the analysis of variance table. 


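(i) The cell totals:

$$y_{11.} = 71.5, \quad y_{12.} = 72.8, \quad y_{13.} = 70.3, \quad y_{14.} = 72.0,$$
$$y_{21.} = 71.5, \quad y_{22.} = 71.5, \quad y_{23.} = 72.9, \quad y_{24.} = 69.8,$$
$$y_{31.} = 70.7, \quad y_{32.} = 72.1, \quad y_{33.} = 72.0, \quad y_{34.} = 72.4.$$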


TABLE 4.13
Screen Lengths (in inches) from a Quality Control Experiment

                      Machine
Operator      1       2       3       4
   1        36.3    36.7    35.1    35.2
            35.2    36.1    35.2    36.8
   2        35.2    35.3    36.8    34.9
            36.3    36.2    36.1    34.9
   3        35.8    36.0    35.9    36.3
            34.9    36.1    36.1    36.1

Source: Burdick and Graybill (1992, p. 118). Used with permission.


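(ii) The row totals:

$$y_{1..} = 286.6, \quad y_{2..} = 285.7, \quad y_{3..} = 287.2.$$

(iii) The column totals:

$$y_{.1.} = 213.7, \quad y_{.2.} = 216.4, \quad y_{.3.} = 215.2, \quad y_{.4.} = 214.2.$$

(iv) The grand total:

$$y_{...} = 859.5.$$

(v)

$$\sum_{i=1}^{3} \sum_{j=1}^{4} \sum_{k=1}^{2} y_{ijk}^2 = (36.3)^2 + (35.2)^2 + \cdots + (36.1)^2 = 30{,}789.67.$$

(vi)

$$\frac{y_{...}^2}{3 \times 4 \times 2} = \frac{(859.5)^2}{24} = 30{,}780.8438.$$

(vii)

$$\frac{1}{4 \times 2} \sum_{i=1}^{3} y_{i..}^2 = \frac{(286.6)^2 + (285.7)^2 + (287.2)^2}{8} = 30{,}780.9863.$$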


TABLE 4.14
Analysis of Variance for the Screen Lengths Data of Table 4.13

Source of     Degrees of   Sum of    Mean     Expected
Variation     Freedom      Squares   Square   Mean Square                                                     F Value   p-Value
Operator       2           0.1425    0.071    $\sigma_e^2 + 2\sigma_{\alpha\beta}^2 + 4 \times 2\sigma_\alpha^2$   0.101     0.905
Machine        3           0.7112    0.237    $\sigma_e^2 + 2\sigma_{\alpha\beta}^2 + 3 \times 2\sigma_\beta^2$    0.339     0.798
Interaction    6           4.1975    0.700    $\sigma_e^2 + 2\sigma_{\alpha\beta}^2$                               2.222     0.113
Error         12           3.7750    0.315    $\sigma_e^2$
Total         23           8.8262

(viii)

$$\frac{1}{3 \times 2} \sum_{j=1}^{4} y_{.j.}^2 = \frac{(213.7)^2 + (216.4)^2 + (215.2)^2 + (214.2)^2}{6} = 30{,}781.5550.$$

(ix)

$$\frac{1}{2} \sum_{i=1}^{3} \sum_{j=1}^{4} y_{ij.}^2 = \frac{(71.5)^2 + (72.8)^2 + \cdots + (72.4)^2}{2} = 30{,}785.8950.$$

(x) $SS_T = 30{,}789.67 - 30{,}780.8438 = 8.8262$.

(xi) $SS_{TC} = 30{,}785.8950 - 30{,}780.8438 = 5.0512$.

(xii) $SS_A = 30{,}780.9863 - 30{,}780.8438 = 0.1425$.

(xiii) $SS_B = 30{,}781.5550 - 30{,}780.8438 = 0.7112$.

(xiv) $SS_{AB} = 5.0512 - 0.1425 - 0.7112 = 4.1975$.

(xv) $SS_E = 8.8262 - 5.0512 = 3.7750$.


These results along with the remaining computations are summarized in Table 
4.14. 

We can test the hypotheses of interest using the results shown in Table 4.14. 
The presence of the interaction is tested by comparing the ratio 2.222 with 
the theoretical F distribution with (6, 12) degrees of freedom which is not 
significant (p = 0.113). Hence, there is no evidence of the existence of any 
interaction effects. The existence of a main effect due to operators is tested 
by comparing the ratio 0.101 with the theoretical F distribution with (2, 6) 
degrees of freedom which is also not significant (p = 0.905). Similarly, the 
other main effect due to machines is tested by comparing the ratio 0.339 with 
the F distribution with (3, 6) degrees of freedom and this again is not significant
(p = 0.798). Thus, we may conclude that there are no significant differences 
between the operators as well as between the machines, and also there is no 
evidence of any interaction between the two factors. 
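The sums of squares and F ratios for this random effects example can be
recomputed from the data of Table 4.13. The following is a minimal sketch using
only the standard library; the variable names are illustrative assumptions. It
follows the Model II convention described above, in which both main effects are
tested against the interaction mean square and the interaction is tested
against the error mean square.

```python
# Screen lengths from Table 4.13, keyed by (operator, machine).
lengths = {
    (1, 1): [36.3, 35.2], (1, 2): [36.7, 36.1], (1, 3): [35.1, 35.2], (1, 4): [35.2, 36.8],
    (2, 1): [35.2, 36.3], (2, 2): [35.3, 36.2], (2, 3): [36.8, 36.1], (2, 4): [34.9, 34.9],
    (3, 1): [35.8, 34.9], (3, 2): [36.0, 36.1], (3, 3): [35.9, 36.1], (3, 4): [36.3, 36.1],
}
a, b, n = 3, 4, 2  # operators, machines, screens per operator-machine cell

grand_total = sum(sum(obs) for obs in lengths.values())
correction = grand_total ** 2 / (a * b * n)

row_totals = [sum(sum(lengths[i, j]) for j in range(1, b + 1)) for i in range(1, a + 1)]
col_totals = [sum(sum(lengths[i, j]) for i in range(1, a + 1)) for j in range(1, b + 1)]

ss_total = sum(y * y for obs in lengths.values() for y in obs) - correction
ss_cells = sum(sum(obs) ** 2 for obs in lengths.values()) / n - correction
ss_a = sum(t * t for t in row_totals) / (b * n) - correction
ss_b = sum(t * t for t in col_totals) / (a * n) - correction
ss_ab = ss_cells - ss_a - ss_b
ss_e = ss_total - ss_cells

ms_a = ss_a / (a - 1)
ms_b = ss_b / (b - 1)
ms_ab = ss_ab / ((a - 1) * (b - 1))
ms_e = ss_e / (a * b * (n - 1))

f_ab = ms_ab / ms_e   # interaction, (6, 12) degrees of freedom
f_a = ms_a / ms_ab    # operators, (2, 6) degrees of freedom
f_b = ms_b / ms_ab    # machines, (3, 6) degrees of freedom

print(round(ss_a, 4), round(ss_b, 4), round(ss_ab, 4), round(ss_e, 4), round(ss_total, 4))
print(round(f_a, 3), round(f_b, 3), round(f_ab, 3))
```

The F ratios differ from those in Table 4.14 only in the third decimal place,
because the table forms them from mean squares rounded to three decimals.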




Furthermore, to assess the relative contribution of the variance components, 
we may obtain their estimates using formulae (4.7.20) through (4.7.23). Thus, 
we find that 


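$$\hat{\sigma}_e^2 = 0.315,$$

$$\hat{\sigma}_{\alpha\beta}^2 = \frac{1}{2}(0.700 - 0.315) = 0.193,$$

$$\hat{\sigma}_\beta^2 = \frac{1}{6}(0.237 - 0.700) = -0.077,$$

and

$$\hat{\sigma}_\alpha^2 = \frac{1}{8}(0.071 - 0.700) = -0.079.$$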


The negative estimates are an indication that the corresponding variance com- 
ponents may be zero. The results are consistent with the tests of hypotheses 
performed earlier. It is further evident that the larger part of the variability 
arises in the replication of measurements. 
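The analysis of variance estimates above follow from equating the mean squares
of Table 4.14 to their expected values. The sketch below reproduces that
arithmetic; the function name `anova_variance_components` is an assumption made
for illustration, and setting negative estimates to zero is shown only as one
common convention, not as the procedure prescribed in the text.

```python
def anova_variance_components(ms_a, ms_b, ms_ab, ms_e, a, b, n):
    """Analysis of variance estimators for the balanced two-way random model."""
    return {
        "error": ms_e,
        "interaction": (ms_ab - ms_e) / n,
        "factor_b": (ms_b - ms_ab) / (a * n),
        "factor_a": (ms_a - ms_ab) / (b * n),
    }


# Mean squares from Table 4.14: a = 3 operators, b = 4 machines, n = 2 screens.
estimates = anova_variance_components(
    ms_a=0.071, ms_b=0.237, ms_ab=0.700, ms_e=0.315, a=3, b=4, n=2
)

# One common convention (an assumption here): report negative estimates as zero.
truncated = {name: max(value, 0.0) for name, value in estimates.items()}

for name, value in estimates.items():
    print(f"{name:12s} {value:8.3f}  (truncated: {truncated[name]:.3f})")
```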


4.16 WORKED EXAMPLE FOR MODEL III 


The following example is taken from an experiment described in Youden (1951, 
pp. 64-65). An experiment was performed to determine the effect of time aging 
on the strength of cement. Three mixes of cement were prepared and six spec- 
imens were made from each mix. Three specimens from each mix were tested 
after two days and later after seven days. The test specimens were two-inch 
cubes that yielded under the given load and were measured in units of 10 
pounds. The data are presented in Table 4.15. 

The experiment just described constitutes a mixed effects model. The mixes 
are random components, a sample of three drawn from a large number of mixes. 
The results of the experiment should be valid for the entire distribution of mixes. 
On the other hand, effects of aging are fixed effects. The conclusions of the ex- 
periment will reveal whether the yield loads differ after two or seven days, these 
periods being fixed. Hence, the data of Table 4.15 should be analyzed using 
Model III. Since there are three observations for each mix and aging combi- 
nation, this will enable the experimenter to test for the presence of interaction. 
Interaction terms cannot be ignored since it is quite possible that the three mixes 
differ after a long period of time without differing after a short period. In other 
words, the effect of an additional period of time is different for three mixes; 
that is, interaction is present. 




TABLE 4.15 
Yield Loads for Cement Specimens 
Mix 

Aging Mix 1 Mix 2 Mix 3 
2-Day 574 524 576 
Test 564 573 540 
550 551 592 
7-Day 1,092 1,028 1,066 
Test 1,086 1,073 1,045 
1,065 998 1,055 


Source: Youden (1951, p. 65). Used with permission. 


The mathematical model for this experiment would be 


$$y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + e_{ijk}, \qquad i = 1, 2; \; j = 1, 2, 3; \; k = 1, 2, 3,$$

where $\mu$ is the general mean, $\alpha_i$ is the effect of the i-th "aging"
($\sum_{i=1}^{2} \alpha_i = 0$); $\beta_j$ is the effect of the j-th "mix" and is a
random variable assumed to be normally distributed with mean zero and variance
$\sigma_\beta^2$; $(\alpha\beta)_{ij}$ is the interaction of the i-th "aging" with
the j-th mix and is a random variable assumed to be normally distributed with
mean zero and variance $\sigma_{\alpha\beta}^2$ ($\sum_{i=1}^{2} (\alpha\beta)_{ij} = 0$,
for j = 1, 2, 3); and $e_{ijk}$'s are experimental errors assumed to be
independently and normally distributed with mean zero and variance $\sigma_e^2$.


The following computations lead to the analysis of variance table. 


(i) The cell totals:

$$y_{11.} = 1{,}688, \quad y_{12.} = 1{,}648, \quad y_{13.} = 1{,}708,$$
$$y_{21.} = 3{,}243, \quad y_{22.} = 3{,}099, \quad y_{23.} = 3{,}166.$$

(ii) The row totals:

$$y_{1..} = 5{,}044, \quad y_{2..} = 9{,}508.$$

(iii) The column totals:

$$y_{.1.} = 4{,}931, \quad y_{.2.} = 4{,}747, \quad y_{.3.} = 4{,}874.$$

(iv) The grand total:

$$y_{...} = 5{,}044 + 9{,}508 = 14{,}552.$$


(v)

$$\sum_{i=1}^{2} \sum_{j=1}^{3} \sum_{k=1}^{3} y_{ijk}^2 = (574)^2 + (564)^2 + \cdots + (1{,}055)^2 = 12{,}882{,}026.$$

(vi)

$$\frac{y_{...}^2}{2 \times 3 \times 3} = \frac{(14{,}552)^2}{18} = 11{,}764{,}483.5556.$$

(vii)

$$\frac{1}{3 \times 3} \sum_{i=1}^{2} y_{i..}^2 = \frac{(5{,}044)^2 + (9{,}508)^2}{9} = 12{,}871{,}555.5556.$$

(viii)

$$\frac{1}{2 \times 3} \sum_{j=1}^{3} y_{.j.}^2 = \frac{(4{,}931)^2 + (4{,}747)^2 + (4{,}874)^2}{6} = 11{,}767{,}441.0000.$$

(ix)

$$\frac{1}{3} \sum_{i=1}^{2} \sum_{j=1}^{3} y_{ij.}^2 = \frac{(1{,}688)^2 + (1{,}648)^2 + \cdots + (3{,}166)^2}{3} = 12{,}875{,}639.3333.$$

(x) $SS_T = 12{,}882{,}026 - 11{,}764{,}483.5556 = 1{,}117{,}542.4444$.

(xi) $SS_{TC} = 12{,}875{,}639.3333 - 11{,}764{,}483.5556 = 1{,}111{,}155.7777$.

(xii) $SS_A = 12{,}871{,}555.5556 - 11{,}764{,}483.5556 = 1{,}107{,}072.0000$.

(xiii) $SS_B = 11{,}767{,}441.0000 - 11{,}764{,}483.5556 = 2{,}957.4444$.

(xiv) $SS_{AB} = 1{,}111{,}155.7777 - 1{,}107{,}072.0000 - 2{,}957.4444 = 1{,}126.3333$.

(xv) $SS_E = 1{,}117{,}542.4444 - 1{,}111{,}155.7777 = 6{,}386.6667$.


These results along with the remaining analysis of variance computations
are summarized in Table 4.16. Note that the numerical values of F tests are
calculated differently here than in the case of Model I or Model II. The test
for interaction is the same, but the test for random effects involves
$MS_B/MS_E$ and the test for fixed effects involves $MS_A/MS_{AB}$. If we choose
the level of significance $\alpha = 0.05$, we find from Appendix Table V that

F[1, 2; 0.95] = 18.51

and

F[2, 12; 0.95] = 3.89.


TABLE 4.16
Analysis of Variance for the Yield Loads Data of Table 4.15

Source of     Degrees of   Sum of            Mean            Expected
Variation     Freedom      Squares           Square          Mean Square                                                                           F Value    p-Value
Aging          1           1,107,072.0000    1,107,072.000   $\sigma_e^2 + 3\sigma_{\alpha\beta}^2 + \frac{3 \times 3}{2 - 1} \sum_{i=1}^{2} \alpha_i^2$   1,965.80   <0.001
Mix            2               2,957.4444        1,478.722   $\sigma_e^2 + 2 \times 3\sigma_\beta^2$                                                   2.78       0.102
Interaction    2               1,126.3333          563.167   $\sigma_e^2 + 3\sigma_{\alpha\beta}^2$                                                    1.06       0.377
Error         12               6,386.6667          532.222   $\sigma_e^2$
Total         17           1,117,542.4444


Comparing these values with the computed F values given in Table 4.16, we 
may reach the following conclusions: 


(a) Do not reject the hypothesis of no interaction effects and conclude that 
the data do not give sufficient evidence of the existence of interaction 
between the “aging” and the “mixes” (p = 0.377). 

(b) Do not reject the hypothesis of no “mixes” effects and conclude that the 
mean strength of cement does not vary in the population of mixes. 

(c) Reject the hypothesis of no “aging” effects and conclude that there is a 
significant effect from the additional five days of aging. 
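The mixed model F ratios can be verified from the yield loads of Table 4.15.
The sketch below is illustrative only: the helper `balanced_two_way_ss` and all
variable names are assumptions, and the choice of denominators follows the rule
stated above (fixed effect against interaction, random effect and interaction
against error).

```python
def balanced_two_way_ss(cells, a_levels, b_levels):
    """Sums of squares for a balanced two-way layout with replication."""
    n = len(next(iter(cells.values())))
    a, b = len(a_levels), len(b_levels)
    grand = sum(sum(obs) for obs in cells.values())
    cf = grand ** 2 / (a * b * n)  # correction for the mean
    ss_t = sum(y * y for obs in cells.values() for y in obs) - cf
    ss_cells = sum(sum(obs) ** 2 for obs in cells.values()) / n - cf
    ss_a = sum(sum(sum(cells[i, j]) for j in b_levels) ** 2 for i in a_levels) / (b * n) - cf
    ss_b = sum(sum(sum(cells[i, j]) for i in a_levels) ** 2 for j in b_levels) / (a * n) - cf
    return {"A": ss_a, "B": ss_b, "AB": ss_cells - ss_a - ss_b,
            "E": ss_t - ss_cells, "T": ss_t}


# Yield loads from Table 4.15, keyed by (aging, mix).
loads = {
    ("2-day", 1): [574, 564, 550], ("2-day", 2): [524, 573, 551], ("2-day", 3): [576, 540, 592],
    ("7-day", 1): [1092, 1086, 1065], ("7-day", 2): [1028, 1073, 998], ("7-day", 3): [1066, 1045, 1055],
}
aging, mixes = ["2-day", "7-day"], [1, 2, 3]
a, b, n = 2, 3, 3

ss = balanced_two_way_ss(loads, aging, mixes)
ms_a = ss["A"] / (a - 1)
ms_b = ss["B"] / (b - 1)
ms_ab = ss["AB"] / ((a - 1) * (b - 1))
ms_e = ss["E"] / (a * b * (n - 1))

f_aging = ms_a / ms_ab        # fixed effect tested against interaction
f_mix = ms_b / ms_e           # random effect tested against error
f_interaction = ms_ab / ms_e  # interaction tested against error

print(round(f_aging, 2), round(f_mix, 2), round(f_interaction, 2))
```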


Furthermore, we can make a comparison of two- and seven-day aging effects
by using Tukey's and Scheffé's methods of simultaneous confidence intervals.
For Tukey's procedure, we find from Appendix Table X that

$$q[a, (a - 1)(b - 1); 1 - \alpha] = q[2, 2; 0.95] = 6.09,$$

so that

$$q[a, (a - 1)(b - 1); 1 - \alpha] \sqrt{\frac{MS_{AB}}{bn}} = 6.09 \sqrt{\frac{563.167}{3 \times 3}} = 48.17$$

and, from (4.12.13), a 95 percent simultaneous confidence interval for
$\alpha_2 - \alpha_1$ is given as

$$(1{,}056.44 - 560.44) - 48.17 < \alpha_2 - \alpha_1 < (1{,}056.44 - 560.44) + 48.17$$

or

$$447.83 < \alpha_2 - \alpha_1 < 544.17.$$
For Scheffé's procedure, we find from Appendix Table V that

$$S^2 = F[a - 1, (a - 1)(b - 1); 1 - \alpha] = F[1, 2; 0.95] = 18.51,$$

so that

$$\sqrt{S^2 (a - 1)(bn)^{-1} MS_{AB} \sum_{i=1}^{a} \ell_i^2} = \sqrt{18.51(1)(3 \times 3)^{-1}(563.167)(2)} = 48.13$$

and from (4.12.4) a 95 percent simultaneous confidence interval for
$\alpha_2 - \alpha_1$ is given as

$$(1{,}056.44 - 560.44) - 48.13 < \alpha_2 - \alpha_1 < (1{,}056.44 - 560.44) + 48.13$$

or

$$447.87 < \alpha_2 - \alpha_1 < 544.13.$$

Notice that with a = 2, there is only one contrast and both Tukey's and
Scheffé's procedures are equivalent to the usual t test.
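The two interval half-widths can be reproduced with a short calculation. This
is a sketch under stated assumptions: the tabulated values q[2, 2; 0.95] = 6.09
and F[1, 2; 0.95] = 18.51 are entered by hand as quoted in the text, the
contrast coefficients (-1, 1) represent the seven-day minus two-day comparison,
and the variable names are illustrative.

```python
import math

a, b, n = 2, 3, 3
ms_ab = 563.167                 # interaction mean square from Table 4.16
mean_2day = 5044 / (b * n)      # two-day aging mean
mean_7day = 9508 / (b * n)      # seven-day aging mean
diff = mean_7day - mean_2day

q_crit = 6.09                   # q[2, 2; 0.95], as quoted in the text
f_crit = 18.51                  # F[1, 2; 0.95], as quoted in the text

# Tukey: q * sqrt(MS_AB / (b n))
tukey_half = q_crit * math.sqrt(ms_ab / (b * n))

# Scheffe: sqrt(S^2 (a - 1) MS_AB sum(l_i^2) / (b n)) with S^2 = F
contrast = [-1, 1]
scheffe_half = math.sqrt(
    f_crit * (a - 1) * ms_ab * sum(c * c for c in contrast) / (b * n)
)

print(f"Tukey:   {diff - tukey_half:.2f} to {diff + tukey_half:.2f}")
print(f"Scheffe: {diff - scheffe_half:.2f} to {diff + scheffe_half:.2f}")
```

The small gap between the two half-widths reflects only the rounding of the
tabulated critical values, since q squared equals 2F when a = 2.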

Finally, suppose it is desired to determine the power of the test when the
difference in time effect is as large as 400 psi. Since the test specimens are in
units of 10 lbs, 400 psi corresponds to 40 units. Furthermore, since
$\alpha_1 + \alpha_2 = 0$, this gives $\alpha_1 = -20.0$ and $\alpha_2 = 20.0$. Now,
from Section 4.11, the normalized noncentrality parameter is

$$\phi = \sqrt{\frac{bn \sum_{i=1}^{a} \alpha_i^2}{a(\sigma_e^2 + n\sigma_{\alpha\beta}^2)}} = \sqrt{\frac{3 \times 3\{(-20.0)^2 + (20.0)^2\}}{2(563.167)}} = 2.53,$$

where an estimate of $\sigma_e^2 + n\sigma_{\alpha\beta}^2$ is obtained from
$MS_{AB} = 563.167$. Since the Pearson-Hartley charts do not contain a power curve
for $\nu_1 = 1$ and $\nu_2 = 2$, we calculate the power using the noncentral t
distribution. The noncentrality parameter ($\delta$) for the noncentral t
distribution is determined as $\delta = \sqrt{a}\,\phi = \sqrt{2}(2.53) = 3.58$.
Now, entering Appendix Chart I with $\alpha = 0.05$, df = 2, and $\delta = 3.58$,
the power is found to be about 0.48. The use of Appendix Tables VI and VII with
appropriate interpolation gives essentially the same result. Notice that a very
small number of degrees of freedom for the t test makes it quite insensitive.
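The two noncentrality parameters used in the power calculation are simple to
recompute. The sketch below covers only that arithmetic; it does not evaluate
the power itself, which in the text is read from a chart of the noncentral t
distribution. The variable names are illustrative assumptions.

```python
import math

a, b, n = 2, 3, 3
alpha_effects = [-20.0, 20.0]   # aging effects for a 40-unit difference
ms_ab = 563.167                 # estimate of sigma_e^2 + n * sigma_ab^2

# Normalized noncentrality parameter for the F test of the fixed factor.
phi = math.sqrt(b * n * sum(e * e for e in alpha_effects) / (a * ms_ab))

# Corresponding noncentrality parameter for the noncentral t distribution.
delta = math.sqrt(a) * phi

print(round(phi, 2), round(delta, 2))
```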


4.17 USE OF STATISTICAL COMPUTING PACKAGES 


As in Section 3.16, for a two-way fixed effects analysis of variance with an equal
number of observations and no missing values, the recommended procedure is
SAS ANOVA. If the $n_{ij}$'s are unequal because of only a few missing values,
the missing values could be replaced by their respective cell means and the data
could be analyzed as in the preceding. However, if there is a wide disparity
between the $n_{ij}$'s, the GLM procedure should be used. GLM produces Type I,
Type II, Type III, and Type IV sums of squares. When some cells are empty,
caution should be used in choosing an appropriate sum of squares. For a random
or mixed model analysis, GLM with the RANDOM or TEST option should be used. For
equal $n_{ij}$'s in each cell, estimates of variance components can be readily
obtained from the entries of the analysis of variance table. For unequal
$n_{ij}$'s, PROC MIXED or VARCOMP must be used for estimating variance
components. For the details of SAS commands, see Section 11.1.

Among SPSS procedures, the ANOVA would be a better choice for fixed 
effects analysis involving a balanced layout. For a design involving an
unequal number of observations per cell, or for a random or mixed model analysis,
GLM or MANOVA must be used. For the estimation of variance components,
VARCOMP (available in Release 7.0 and 8.0) is the procedure of choice. For 
instructions regarding SPSS commands, see Section 11.2. 

In using BMDP programs, as indicated in Section 3.15, the two programs
suited for this model are 7D and 2V if the analysis involves only fixed effects
in the model. However, when the number of observations in each cell is rather
large, 7D would be a better choice since it could provide comparative histograms
and descriptive statistics for data in each cell. Similar to GLM, 2V is a general
purpose program for performing fixed effects analysis of variance for both
balanced and unbalanced data sets. For the analysis involving random and mixed
effects models, 3V and 8V can be used. For designs with equal $n_{ij}$'s in each
cell, 8V is recommended since it is simpler to use. For unequal $n_{ij}$'s, 3V
would be the preferred choice. This program also provides estimates of variance
components using maximum likelihood and restricted maximum likelihood procedures.

Utmost care should be exercised in using packaged programs for unequal 
sample sizes and when some cells are empty. It is important to find out how the 
individual program or procedure handles the empty cells and the assumptions it 
makes about the interaction terms. The user should make sure that the program 
outputs the appropriate sums of squares for the tests of hypotheses of interest. 
For some further discussion and details in this regard, see Milliken and Johnson 
(1992, Chapter 14). 




4.18 WORKED EXAMPLES USING STATISTICAL PACKAGES 


In this section, we illustrate the applications of statistical packages to perform 
two-way analysis of variance with interaction for the data sets employed in 
examples presented in Sections 4.13 through 4.16. Figures 4.1 through 4.4 
illustrate the program instructions and the output results for analyzing data 
given in Tables 4.9, 4.11, 4.13, and 4.15, using SAS ANOVA/GLM, SPSS 
MANOVA/GLM, and BMDP 7D/2V/8V procedures. The typical output pro- 
vides the data format listed at the top, cell means, and the entries of the analysis 
of variance table. It should be noticed that in each case, the results are the same 
as those provided using manual computations in Sections 4.13 through 4.16. 
However, note that certain tests of significance in a mixed model may differ 
from one program to the other since they make different model assumptions. 


4.19 THE MEANING AND INTERPRETATION 
OF INTERACTION 


In the discussion of the two-way model (4.1.1), we have assumed the existence 
of interaction to take into account the fact that the two factors may not be 
independent; that is, the effects of one factor may vary with the levels of the 
other factor. Thus, for example, suppose that the yield of a chemical process 
depends on two factors: the concentration of the chemical and the operating 
temperature. Now, if the yield at different concentration levels varies with the 
level of the operating temperature, we would say that interaction is present. 
The lack or presence of interaction is marked by parallelism or nonparallelism
in the plots of average treatment responses. For example, consider two levels
for each factor A and B, denoted by $(A_1, A_2)$ and $(B_1, B_2)$, respectively. Some
possible patterns for observed cell means and presence or lack of interactions 
are illustrated in Figure 4.5. The graphical illustrations allow a visual inspection 
of factor effects and their interactions. Any nonparallel change in the average 
response is an indication of the presence of an interaction. 
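For two factors at two levels each, the parallelism criterion reduces to a
single difference of differences among the four cell means. The sketch below
illustrates this with hypothetical cell means (the numbers are invented for
illustration and do not come from any data set in this chapter); the function
name `interaction_contrast` is likewise an assumption.

```python
def interaction_contrast(cell_means):
    """Difference of differences for a 2 x 2 table of cell means.

    cell_means[i][j] is the mean response at level A_(i+1) and B_(j+1).
    A value of zero means the two profiles are parallel (no interaction).
    """
    effect_of_b_at_a1 = cell_means[0][1] - cell_means[0][0]
    effect_of_b_at_a2 = cell_means[1][1] - cell_means[1][0]
    return effect_of_b_at_a2 - effect_of_b_at_a1


# Hypothetical cell means: rows are A1, A2; columns are B1, B2.
parallel_profiles = [[10.0, 14.0], [16.0, 20.0]]   # B raises the mean by 4 at both A levels
crossing_profiles = [[10.0, 14.0], [20.0, 12.0]]   # the effect of B reverses at A2

print(interaction_contrast(parallel_profiles))
print(interaction_contrast(crossing_profiles))
```

A nonzero contrast in observed means is only a descriptive signal; whether it
exceeds chance variation is what the F test for interaction assesses.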

The existence or nonexistence of interaction effects, as inferred from the 
F test of interaction, can have very important bearing on how one interprets 
and uses the results of an experiment. When two factors A and B interact, an 
important question arises as to whether the main effects of A and B are mean- 
ingful measures to interpret. Thus, if the hypothesis of interaction is rejected, 
we may conclude that the effects of A and B are not additive; that is, factors 
A and B interact. If this happens, testing the significance of A and B factor 
effects becomes meaningless under the present formulation of the model. Note 
that accepting the hypothesis about the A factor effects means that there are 
no differences in the various levels of A when averaged over the levels of B. 
However, in the presence of interaction this interpretation is meaningless. The 
presence of interaction means that the effect of one factor is dependent on the
levels of the other. Similarly, rejecting the hypothesis about the A factor effects 
when interaction is present is also meaningless. The same argument holds true 


Program instructions:

    DATA FITNESS;
    INPUT AGE FITNESS RUNNING;
    DATALINES;
    1 1 669
    1 1 671
    1 2 602
    . . .
    3 3 713
    ;
    PROC ANOVA;
    CLASSES AGE FITNESS;
    MODEL RUNNING=AGE FITNESS AGE*FITNESS;
    RUN;

Output:

    The SAS System
    Analysis of Variance Procedure

    CLASS      LEVELS   VALUES
    AGE        3        1 2 3
    FITNESS    3        1 2 3
    NUMBER OF OBS. IN DATA SET=18

    Dependent Variable: RUNNING
                           Sum of        Mean
    Source            DF   Squares       Square      F Value   Pr > F
    Model              8   378352.00     47294.00    147.38    0.0001
    Error              9     2888.00       320.89
    Corrected Total   17   381240.00

    R-Square    C.V.        Root MSE   RUNNING Mean
    0.992425    2.508876    17.913     714.00

    Source         DF   Anova SS     Mean Square   F Value   Pr > F
    AGE             2   195733.00    97866.50      304.99    0.0001
    FITNESS         2   160867.00    80433.50      250.66    0.0001
    AGE*FITNESS     4    21752.00     5438.00       16.95    0.0003

(i) SAS application: SAS ANOVA instructions and output for the two-way fixed
effects analysis of variance with two observations per cell.


Program instructions:

    DATA LIST
     /AGE 1 FITNESS 3
     RUNNING 5-8.
    BEGIN DATA.
    1 1 669
    1 1 671
    . . .
    3 3 713
    END DATA.
    MANOVA RUNNING BY
     AGE(1,3) FITNESS(1,3)
     /DESIGN=AGE . . .

Output:

    Analysis of Variance-Design 1

    Tests of Significance for RUNNING using UNIQUE sums of squares

    Source of Variation     SS        DF    MS          F        Sig of F
    WITHIN+RESIDUAL           2888.    9      320.89
    AGE                     195733.    2    97866.50    304.99   .000
    FITNESS                 160867.    2    80433.50    250.66   .000
    AGE BY FITNESS           21752.    4     5438.00     16.95   .000
    (Model)                 378352.    8    47294.00    147.38   .000
    (Total)                 381240.   17    22425.88

    R-Squared = .992    Adjusted R-Squared = .986

(ii) SPSS application: SPSS MANOVA instructions and output for the two-way
fixed effects analysis of variance with two observations per cell.


Program instructions:

    /INPUT FILE='C:\SAHAI\TEXTO\EJE8.TXT'.
     FORMAT=FREE.
     VARIABLES=3.
    /VARIABLE NAMES=AGE,FIT,RUN.
    /GROUP VARIABLE=AGE,FIT.
     CODES(AGE)=1,2,3.
     NAMES(AGE)=A40,A50,A60.
     CODES(FIT)=1,2,3.
     NAMES(FIT)=L,M,H.
    /HISTOGRAM GROUPING=AGE,FIT.
     VARIABLE=RUNNING.
    /END
    1 1 669
    1 1 671
    . . .
    3 3 713

Output:

    BMDP7D - ONE- AND TWO-WAY ANALYSIS OF VARIANCE WITH DATA SCREENING
    Release: 7.0 (BMDP/DYNAMIC)

    ANALYSIS OF VARIANCE
    SOURCE         SUM OF SQUARES    DF   MEAN SQUARE    F VALUE   PROB.
    AGE            195733.0000        2   97866.5000
    FITNESS        160867.0000        2   80433.5000
    INTERACTION     21752.0000        4    5438.0000
    ERROR            2888.0000        9     320.8889

    ANALYSIS OF VARIANCE; VARIANCES ARE NOT ASSUMED TO BE EQUAL
    WELCH               4   1031.90
    BROWN-FORSYTHE
     AGE
     FITNESS
     INTERACTION

(iii) BMDP application: BMDP 7D instructions and output for the two-way fixed
effects analysis of variance with two observations per cell.


FIGURE 4.1 Program Instructions and Output for the Two-Way Fixed Effects 
Analysis of Variance with Two Observations per Cell: Data on Running Time (in 
seconds) to Complete a 1.5 Mile Course (Table 4.9). 


Program instructions:

    DATA RATWEIGT;
    INPUT SOURCE QUANTITY GAINS;
    DATALINES;
    1 1 81
    . . .
    3 2 107
    ;
    PROC GLM;
    CLASSES SOURCE QUANTITY;
    MODEL GAINS=SOURCE QUANTITY SOURCE*QUANTITY;
    RUN;

Output:

    The SAS System
    General Linear Models Procedure

    CLASS       LEVELS   VALUES
    SOURCE      3        1 2 3
    QUANTITY    2        1 2
    NUMBER OF OBS. IN DATA SET=52

    Dependent Variable: GAINS
                           Sum of        Mean
    Source            DF   Squares       Square      F Value   Pr > F
    Model              5    4807.2934    961.4587    4.86      0.0012
    Error             46    9105.0143    197.9351
    Corrected Total   51   13912.3077

    R-Square    C.V.        Root MSE   GAINS Mean
    0.345542    16.05761    14.069     87.615

    Source             DF   Type I SS    Mean Square   F Value   Pr > F
    SOURCE              2    302.1077     151.0538      0.76     0.4720
    QUANTITY            1   2771.9325    2771.9325     14.00     0.0005
    SOURCE*QUANTITY     2   1733.2532     866.6266      4.38     0.0182

    Source             DF   Type III SS  Mean Square   F Value   Pr > F
    SOURCE              2    228.7266     114.3633      0.58     0.5652
    QUANTITY            1   3274.4985    3274.4985     16.54     0.0002
    SOURCE*QUANTITY     2   1733.2532     866.6266      4.38     0.0182

(i) SAS application: SAS GLM instructions and output for the two-way fixed
effects analysis of variance with unequal numbers of observations per cell.


Program instructions:

    DATA LIST
     /SOURCE 1 QUANTITY 3
     GAINS 5-7.
    . . .
    MANOVA GAINS BY
     SOURCE(1,3) QUANTITY(1,2)
     /DESIGN=SOURCE QUANTITY
     SOURCE BY QUANTITY.

Output:

    Analysis of Variance-Design 1

    Tests of Significance for GAINS using UNIQUE sums of squares

    Source of Variation     SS       DF    MS         F       Sig of F
    WITHIN+RESIDUAL          9105.   46     197.94
    SOURCE                    228.    2     114.36      .58   .565
    QUANTITY                 3274.    1    3274.50    16.54   .000
    SOURCE BY QUANTITY       1733.    2     866.63     4.38   .018
    (Model)                  4807.    5     961.46     4.86   .001
    (Total)                 13912.   51     272.79

    R-Squared = .346    Adjusted R-Squared = .274

(ii) SPSS application: SPSS MANOVA instructions and output for the two-way fixed effects
analysis of variance with unequal numbers of observations per cell.


Program instructions:

    /INPUT FILE='C:\SAHAI\TEXTO\EJE9.TXT'.
     FORMAT=FREE.
     VARIABLES=3.
    /VARIABLE NAMES=SOURCE,QUANTITY,GAINS.
    /GROUP VARIABLE=S,Q.
     CODES(SOURCE)=1,2,3.
     NAMES(SOURCE)=B,P,C.
     CODES(QUANTITY)=1,2.
     NAMES(QUANTITY)=H,L.
    /DESIGN DEPENDENT=GAINS.
    /END
    1 1 81
    . . .
    3 2 107

Output:

    BMDP2V - ANALYSIS OF VARIANCE AND COVARIANCE WITH REPEATED MEASURES.
    Release: 7.0 (BMDP/DYNAMIC)

    ANALYSIS OF VARIANCE FOR THE 1-ST DEPENDENT VARIABLE
    THE TRIALS ARE REPRESENTED BY THE VARIABLES:GAINS

    SOURCE      SUM OF SQUARES    D.F.   MEAN SQUARE      F
    MEAN        393454.04800       1     393454.04800     1987.79
    SOURCE         228.72657       2        114.36328        0.58
    QUANTITY      3274.49851       1       3274.49851       16.54
    SQ            1733.25321       2        866.62660        4.38
    ERROR         9105.01429      46        197.93509

(iii) BMDP application: BMDP 2V instructions and output for the two-way fixed effects analysis
of variance with unequal numbers of observations per cell.


FIGURE 4.2 Program Instructions and Output for the Two-Way Fixed Effects 
Analysis of Variance with Unequal Numbers of Observations per Cell: Data on 
Weight Gains of Rats under Different Diets (Table 4.11). 




DATA SCREENLT;
INPUT OPERATOR MACHINE LENGTHS;
DATALINES;
1 1 36.3
1 1 35.2
1 2 36.7
1 2 36.1
. . .
3 4 36.1
;
PROC GLM;
CLASSES OPERATOR MACHINE;
MODEL LENGTHS=OPERATOR MACHINE OPERATOR*MACHINE;
RANDOM OPERATOR MACHINE OPERATOR*MACHINE;
TEST H=OPERATOR E=OPERATOR*MACHINE;
TEST H=MACHINE E=OPERATOR*MACHINE;
RUN;

CLASS      LEVELS   VALUES
OPERATOR   3        1 2 3
MACHINE    4        1 2 3 4

NUMBER OF OBS. IN DATA SET=24

The SAS System
General Linear Models Procedure
Dependent Variable: LENGTHS

                         Sum of        Mean
Source              DF   Squares       Square      F Value   Pr > F
Model               11   5.0512500     0.4592045      1.46   0.2626
Error               12   3.7750000     0.3145833
Corrected Total     23   8.8262500

R-Square     C.V.        Root MSE    LENGTHS Mean
0.572299     1.566149    0.5609      35.813

Source              DF   Type III SS   Mean Square   F Value   Pr > F
OPERATOR             2   0.1425000     0.0712500        0.23   0.8007
MACHINE              3   0.7112500     0.2370833        0.75   0.5411
OPERATOR*MACHINE     6   4.1975000     0.6995833        2.22   0.1125

Source              Type III Expected Mean Square
OPERATOR            Var(Error) + 2 Var(OPERATOR*MACHINE) + 8 Var(OPERATOR)
MACHINE             Var(Error) + 2 Var(OPERATOR*MACHINE) + 6 Var(MACHINE)
OPERATOR*MACHINE    Var(Error) + 2 Var(OPERATOR*MACHINE)

Tests of Hypotheses using the Type III MS for OPERATOR*MACHINE as an error term

Source              DF   Type III SS   Mean Square   F Value   Pr > F
OPERATOR             2   0.1425000     0.0712500        0.10   0.9047
MACHINE              3   0.7112500     0.2370833        0.34   0.7985


(i) SAS application: SAS GLM instructions and output for the two-way random effects analysis
of variance with two observations per cell.


DATA LIST
 /OPERATOR 1
  MACHINE 3
  LENGTHS 5-8 (1)
BEGIN DATA.
1 1 36.3
1 1 35.2
1 2 36.7
1 2 36.1
. . .
3 4 36.1
END DATA.
GLM LENGTHS BY
  OPERATOR MACHINE
 /DESIGN OPERATOR
  MACHINE
  OPERATOR*MACHINE
 /RANDOM OPERATOR
  MACHINE.

Tests of Between-Subjects Effects        Dependent Variable: LENGTHS

Source                          Type III SS   df   Mean Square       F    Sig.
OPERATOR           Hypothesis          .142    2     7.125E-02    .102    .905
                   Error              4.197    6      .700 (a)
MACHINE            Hypothesis          .711    3      .237        .339    .798
                   Error              4.197    6      .700 (a)
OPERATOR*MACHINE   Hypothesis         4.197    6      .700       2.224    .112
                   Error              3.775   12      .315 (b)
a MS(OPERATOR*MACHINE)    b MS(Error)

Expected Mean Squares (a,b)
                               Variance Component
Source              Var(O)   Var(M)   Var(O*M)   Var(Error)
OPERATOR             8.000     .000      2.000        1.000
MACHINE               .000    6.000      2.000        1.000
OPERATOR*MACHINE      .000     .000      2.000        1.000
Error                 .000     .000       .000        1.000
a For each source, the expected mean square equals the sum of the
coefficients in the cells times the variance components, plus a
quadratic term involving effects in the Quadratic Term cell.
b Expected Mean Squares are based on the Type III Sums of Squares.


(ii) SPSS application: SPSS GLM instructions and output for the two-way random effects
analysis of variance with two observations per cell.


FIGURE 4.3. Program Instructions and Output for the Two-Way Random Effects 
Analysis of Variance with Two Observations per Cell: Data on Screen Lengths 
from a Quality Control Experiment (Table 4.13). 


about the effects of factor B. Thus, when stating the effects of one factor it is 
necessary to specify the level of the other. This is the most important meaning 
of the interaction, namely, when interactions are present, the factors themselves 
cannot be evaluated individually. The presence of interactions requires that the 
factors be evaluated jointly rather than individually. 




FILE='C:\SAHAI\TEXTO\EJE10.TXT'.
FORMAT=FREE.
VARIABLES=2.
/VARIABLE NAMES=L1,L2.
/DESIGN NAMES=O,M,L.
LEVELS=3,4,2.
RANDOM=O,M,L.
MODEL='O,M,L(OM)'.

BMDP8V - GENERAL MIXED MODEL ANALYSIS OF VARIANCE - EQUAL CELL SIZES
Release: 7.0 (BMDP/DYNAMIC)

ANALYSIS OF VARIANCE DESIGN
INDEX               O     M     L
NUMBER OF LEVELS    3     4     2
POPULATION SIZE     INF   INF   INF
MODEL               O, M, L(OM)

ANALYSIS OF VARIANCE FOR DEPENDENT VARIABLE 1

SOURCE      ERROR TERM   SUM OF SQUARES   D.F.   MEAN SQUARE      F      PROB.
MEAN                       3.0780843E+4      1   30780.84348
OPERATOR    OM             0.1424993E+0      2       0.07125   0.10     0.9047
MACHINE     OM             0.7112480E+0      3       0.23708   0.34     0.7985
OM          L(OM)          4.1974911E+0      6       0.69958   2.22     0.1125
L(OM)                      3.7749950E+0     12       0.31458

SOURCE      EXPECTED MEAN SQUARE         ESTIMATES OF VARIANCE COMPONENTS
MEAN        24(1)+8(2)+6(3)+2(4)+(5)     1282.55145
OPERATOR    8(2)+2(4)+(5)                  -0.07854
MACHINE     6(3)+2(4)+(5)                  -0.07708
OM          2(4)+(5)                        0.19250
L(OM)       (5)                             0.31458


(iii) BMDP application: BMDP 8V instructions and output for the two-way random effects
analysis of variance with two observations per cell.


FIGURE 4.3 (continued)


Significant interactions serve as a warning: treatment differences possibly 
do exist, but to specify exactly how the treatments differ, one must look within 
the levels of the other factor. The presence of the interaction effects is a signal 
that in any predictive use of the results, effects ascribed to a particular treatment 
representing one factor are best qualified by specifying the level of the other 
factor. This is especially important if one is going to try to use estimated effects 
in forecasting the result of a treatment to an experimental unit. If interaction 
effects are present, the best forecast can be made only if the particular levels of 
both factors are known. 

When the observations suggest the presence of significant interactions, it 
is important to determine whether large interactions really do exist or whether 
there may be some other reasons for the presence of the interactions. Often large 
interactions may exist as a result of the dependent variable being measured on 
an inappropriate scale, and the use of a simple transformation may remove 
most of the interaction effects. Some simple transformations that are helpful 
in reducing the importance of interactions include the logarithmic, reciprocal, 
square, and square-root transformations (see Section 2.22 for a discussion of 
these transformations).
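
The effect of a change of scale can be illustrated numerically. The following is a minimal sketch, not taken from the text: the cell means are hypothetical and are constructed to be exactly multiplicative (each cell mean is a row factor times a column factor), and `interaction_ss` is an illustrative helper name. On the original scale the interaction sum of squares is large; after a logarithmic transformation the structure becomes additive and the interaction vanishes up to rounding error.

```python
import math

def interaction_ss(cell_means):
    """Sum of squared interaction effects m_ij - m_i. - m_.j + m_.. for a table of cell means."""
    a, b = len(cell_means), len(cell_means[0])
    grand = sum(sum(row) for row in cell_means) / (a * b)
    row_means = [sum(row) / b for row in cell_means]
    col_means = [sum(cell_means[i][j] for i in range(a)) / a for j in range(b)]
    return sum(
        (cell_means[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(a)
        for j in range(b)
    )

# Hypothetical multiplicative cell means: m_ij = r_i * c_j (assumed values, for illustration only).
row_factors = [1.0, 2.0, 4.0]
col_factors = [10.0, 30.0]
means = [[r * c for c in col_factors] for r in row_factors]
log_means = [[math.log(m) for m in row] for row in means]

raw_ss = interaction_ss(means)      # interaction on the original scale
log_ss = interaction_ss(log_means)  # interaction after the log transformation

print(f"interaction SS, original scale: {raw_ss:.4f}")
print(f"interaction SS, log scale:      {log_ss:.3e}")
```

Real data are rarely exactly multiplicative, so in practice a transformation can be expected to reduce the interaction rather than remove it entirely.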

Sometimes, the investigator may think that there are no interactions; however, 
the data obtained may indicate a considerable amount of interactions. This 
could possibly happen purely by chance variation. On the other hand, such 
unexpectedly large interactions may simply occur due to the presence of outliers 
(observations much different from the rest of the data). The entire interaction 
may depend upon just one observation that may be wrong or an outlier. One 
should look at the data more carefully for the presence of an outlier before 
discarding them. If after further examination, the data look normal, there is 
the possibility of some complicated and unsuspected phenomenon that may 
require investigation. If the observations were not made using some random 
device, considerable time effects may be embedded in the data obtained. Thus, 




DATA YIELDLOD;
INPUT AGING MIX YIELD;
DATALINES;
1 1 574
1 1 564
1 1 550
1 2 524
1 2 573
. . .
2 3 1055
;
PROC GLM;
CLASSES AGING MIX;
MODEL YIELD=AGING MIX AGING*MIX;
RANDOM MIX AGING*MIX;
TEST H=AGING E=AGING*MIX;
RUN;

CLASS    LEVELS   VALUES
AGING    2        1 2
MIX      3        1 2 3

NUMBER OF OBS. IN DATA SET=18

The SAS System
General Linear Models Procedure
Dependent Variable: YIELD

                         Sum of          Mean
Source              DF   Squares         Square         F Value   Pr > F
Model                5   1111155.7778    222231.1556     417.55   0.0001
Error               12      6386.6667       532.2222
Corrected Total     17   1117542.4444

R-Square     C.V.         Root MSE      YIELD Mean
0.994285     2.8536212    23.069942     808.44444444

Source       DF   Type III SS    Mean Square    F Value   Pr > F
AGING         1   1107072.000    1107072.000    2080.09   0.0001
MIX           2      2957.444       1478.722       2.78   0.1020
AGING*MIX     2      1126.333        563.167       1.06   0.3774

Source       Type III Expected Mean Square
AGING        Var(Error) + 3 Var(AGING*MIX) + Q(AGING)
MIX          Var(Error) + 3 Var(AGING*MIX) + 6 Var(MIX)
AGING*MIX    Var(Error) + 3 Var(AGING*MIX)

Tests of Hypotheses using the Type III MS for AGING*MIX as an error term

Source       DF   Type III SS    Mean Square    F Value   Pr > F
AGING         1   1107072.000    1107072.000    1965.80   0.0005


(i) SAS application: SAS GLM instructions and output for the two-way mixed effects analysis
of variance with three observations per cell.


. . .
BEGIN DATA.
1 1 574
1 1 564
1 1 550
1 2 524
1 2 573
. . .
  MIX AGING*MIX
 /RANDOM MIX.

Tests of Between-Subjects Effects        Dependent Variable: YIELD

Source                   Type III SS   df   Mean Square          F
AGING       Hypothesis   1107072.000    1   1107072.000   1965.798
            Error           1126.333    2       563.167 (a)
MIX         Hypothesis      2957.444    2      1478.722      2.626
            Error           1126.333    2       563.167 (a)
AGING*MIX   Hypothesis      1126.333    2       563.167      1.058
            Error           6386.667   12       532.222 (b)
a MS(AGING*MIX)    b MS(Error)

Expected Mean Squares (a,b)
                        Variance Component
Source       Var(MIX)   Var(AGING*MIX)   Var(Error)   Quadratic Term
AGING            .000            3.000        1.000   AGING
MIX             6.000            3.000        1.000
AGING*MIX        .000            3.000        1.000
Error            .000             .000        1.000
a For each source, the expected mean square equals the sum of the coefficients
in the cells times the variance components, plus a quadratic term involving
effects in the Quadratic Term cell.
b Expected Mean Squares are based on the Type III Sums of Squares.


(ii) SPSS application: SPSS GLM instructions and output for the two-way mixed effects analysis
of variance with three observations per cell.


FIGURE 4.4 Program Instructions and Output for the Two-Way Mixed Effects 
Analysis of Variance with Three Observations per Cell: Data on Yield Loads for 
Cement Specimens (Table 4.15). 


the error terms can no longer be assumed to be uncorrelated. In some cases, 
an uncontrolled variable may affect the results of the observations showing the 
presence of interactions. For example, in a laboratory experiment involving 
mice, the location of the cage may have an effect on the outcome and if this 
factor was left uncontrolled or the mice were not randomly assigned, we might 
observe an apparent interaction when there was none. 

It has also been found that interactions frequently occur when the main effects 
are large. Interactions usually become less important by reducing the differences 
among the levels of treatment, and thus moderating the size of the main effects. 




/INPUT FILE='C:\SAHAI\TEXTO\EJE11.TXT'.
FORMAT=FREE.
VARIABLES=3.
/VARIABLE NAMES=Y1,Y2,Y3.
/DESIGN NAMES=AGING,MIX,YIELD.
LEVELS=2,3,3.
RANDOM=MIX.
FIXED=AGING.
MODEL='A,M,Y(AM)'.
574 564 550
524 573 551
. . .

BMDP8V - GENERAL MIXED MODEL ANALYSIS OF VARIANCE - EQUAL CELL SIZES
Release: 7.0 (BMDP/DYNAMIC)

ANALYSIS OF VARIANCE FOR DEPENDENT VARIABLE 1

SOURCE   ERROR TERM   SUM OF SQUARES   D.F.   MEAN SQUARE         F    PROB.
MEAN     MIX               11764483.      1    11764483.6   7955.84   0.0001
AGING    AM                 1107072.      1     1107072.0   1965.80   0.0005
MIX      Y(AM)                 2957.      2        1478.7      2.78   0.1020
AM       Y(AM)                 1126.      2         563.2      1.06   0.3774
Y(AM)                          6386.     12         532.2

SOURCE   EXPECTED MEAN SQUARE   ESTIMATES OF VARIANCE COMPONENTS
MEAN     18(1)+6(3)+(5)         653500.26852
AGING    9(2)+3(4)+(5)          122945.42593
MIX      6(3)+(5)                  157.75000
AM       3(4)+(5)                   10.31481
Y(AM)    (5)                       532.22222


(iii) BMDP application: BMDP 8V instructions and output for the two-way mixed effects
analysis of variance with three observations per cell.


FIGURE 4.4 (continued)


For this reason, the presence of interaction effects can be most important to the 
interpretation of the experiment. Although it is necessary to consider possible 
interaction effects even in fairly simple experiments, the subject of interaction 
and of the interpretation that should be given to significant tests for interaction 
is neither simple nor fully explored. For a broad review of various aspects of 
interactions, see Cox (1984). 


4.20 INTERACTION WITH ONE OBSERVATION PER CELL 


In the discussion of model (3.1.1) in Chapter 3, we had assumed that there are 
no interaction terms. If the existence of interaction is assumed, model (3.1.1) 
becomes 


$$y_{ij} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + e_{ij}, \tag{4.20.1}$$


where $\mu$, $\alpha_i$'s, $\beta_j$'s, and $e_{ij}$'s are defined as in model (3.1.1), and $(\alpha\beta)_{ij}$'s are the
interaction effects between factors A and B, which are assumed to be constant
under Model I and are randomly distributed with mean zero and variance $\sigma_{\alpha\beta}^2$
under Models II and III. Proceeding as before, the pertinent analysis of variance
can be derived and is shown in Table 4.17.

On comparing Tables 3.2 and 4.17, it is seen that they are the same except for the
differences in the expressions of the expected mean square column. In Table 4.17,
if we let $(\alpha\beta)_{ij} = 0$ or $\sigma_{\alpha\beta}^2 = 0$ and change the word interaction to error, we
have exactly the same analysis of variance table as in Table 3.2. However, if the
assumption of no interaction is not tenable, we have the following inferential
problems. Under Model I, no direct tests are possible since the hypothesis that
either of the effects $\alpha_i$'s and $\beta_j$'s or the interaction $(\alpha\beta)_{ij}$'s are zero gives
us no suitable mean squares to compare. When the interactions are present,
$SS_{AB}$ has a noncentral chi-square distribution and the F ratios $MS_A/MS_{AB}$ and



FIGURE 4.5 Patterns of Observed Cell Means and Existence or Nonexistence 
of Interaction Effects: (a) No effect of factor A, large effect of factor B, and no 
AB interaction; (b) Large effect of factor A, moderate effect of factor B, and no 
AB interaction; (c) No effect of factor A, large effect of factor B, and large AB 
interaction; (d) No effect of factor A, no effect of factor B, but large AB interaction; 
(e) Large effect of factor A, no effect of factor B, with small AB interaction; (f) 
Large effect of factor A, small effect of factor B, with small AB interaction. (The 
graphs (a) to (f) are obtained by representing the levels of factor A as values on the 
x-axis and plotting the cell means at those levels as values on the y-axis. Separate 
curves are drawn for each level of factor B. Alternatively, one could represent the 
levels of B on the x-axis and separate curves drawn for each level of factor A.) 
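
The patterns described in the caption can also be read off numerically from a table of cell means. The sketch below is illustrative only: the 2 x 2 cell means are hypothetical values chosen to mimic pattern (d), and `two_way_effects` is an assumed helper name, not a routine from the text. It decomposes each cell mean into a grand mean, main effects, and interaction effects.

```python
def two_way_effects(cell_means):
    """Decompose cell means m_ij into grand mean, A effects, B effects and AB interaction effects.

    Rows of cell_means index the levels of factor A, columns the levels of factor B.
    """
    a, b = len(cell_means), len(cell_means[0])
    grand = sum(sum(row) for row in cell_means) / (a * b)
    a_eff = [sum(row) / b - grand for row in cell_means]
    b_eff = [sum(cell_means[i][j] for i in range(a)) / a - grand for j in range(b)]
    ab_eff = [
        [cell_means[i][j] - grand - a_eff[i] - b_eff[j] for j in range(b)]
        for i in range(a)
    ]
    return grand, a_eff, b_eff, ab_eff

# Hypothetical cell means resembling pattern (d): the two B curves cross,
# so neither factor has a main effect but the interaction is large.
pattern_d = [[10.0, 20.0],   # A1 at B1, B2
             [20.0, 10.0]]   # A2 at B1, B2

grand, a_eff, b_eff, ab_eff = two_way_effects(pattern_d)
print("grand mean:", grand)
print("A effects:", a_eff)
print("B effects:", b_eff)
print("AB interaction effects:", ab_eff)
```

In this constructed case both sets of main effects are zero while every interaction effect has absolute value 5, which is why the effect of A cannot be stated without naming the level of B.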


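TABLE 4.17
Analysis of Variance for Model (4.20.1)

                                                        Expected Mean Square
Source of      Degrees of
Variation      Freedom         Model I                                                                             Model II                                              Model III

Due to A       $a-1$           $\sigma_e^2 + \frac{b}{a-1}\sum_{i=1}^{a}\alpha_i^2$                                $\sigma_e^2 + \sigma_{\alpha\beta}^2 + b\sigma_\alpha^2$    $\sigma_e^2 + \sigma_{\alpha\beta}^2 + \frac{b}{a-1}\sum_{i=1}^{a}\alpha_i^2$
Due to B       $b-1$           $\sigma_e^2 + \frac{a}{b-1}\sum_{j=1}^{b}\beta_j^2$                                 $\sigma_e^2 + \sigma_{\alpha\beta}^2 + a\sigma_\beta^2$     $\sigma_e^2 + a\sigma_\beta^2$
Interaction    $(a-1)(b-1)$    $\sigma_e^2 + \frac{1}{(a-1)(b-1)}\sum_{i=1}^{a}\sum_{j=1}^{b}(\alpha\beta)_{ij}^2$   $\sigma_e^2 + \sigma_{\alpha\beta}^2$                        $\sigma_e^2 + \sigma_{\alpha\beta}^2$
A x B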




$MS_B/MS_{AB}$ have doubly noncentral F distributions.⁸ The usual F tests for
main effects $\alpha_i$'s and $\beta_j$'s may be inefficient if there is appreciable interaction
so that $\sum_{i=1}^{a}\sum_{j=1}^{b}(\alpha\beta)_{ij}^2 \neq 0$, since the denominator mean square will be
inflated by the extra component. On the other hand, if either variance ratio is
significant, it may be taken that the corresponding effect is real.

Even though there are no direct tests for interaction effects, Tukey (1949b)
has devised a test that may be used for testing the existence of interaction terms.
The null hypothesis and the alternate are:⁹


$$\begin{aligned}
H_0 &: (\alpha\beta)_{ij} = 0, \quad i = 1, 2, \ldots, a;\ j = 1, 2, \ldots, b \\
&\ \text{versus} \\
H_1 &: \text{not all } (\alpha\beta)_{ij}\text{'s are zero.}
\end{aligned} \tag{4.20.2}$$


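The procedure requires the computation of the sum of squares for nonadditivity
defined by


$$SS_N = \frac{\left[\sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij}(\bar{y}_{i.} - \bar{y}_{..})(\bar{y}_{.j} - \bar{y}_{..})\right]^2}{\sum_{i=1}^{a}(\bar{y}_{i.} - \bar{y}_{..})^2 \sum_{j=1}^{b}(\bar{y}_{.j} - \bar{y}_{..})^2}. \tag{4.20.3}$$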


It can be shown that under $H_0$, the statistic


$$F^* = \frac{SS_N/1}{(SS_{AB} - SS_N)/[(a-1)(b-1)-1]} \tag{4.20.4}$$

is distributed as an $F[1, (a-1)(b-1)-1]$ variable. Note that there is one degree
of freedom associated with $SS_N$ and $(a-1)(b-1)-1 = ab - a - b$ degrees
of freedom are associated with $SS_{AB} - SS_N$. Thus, a large value of $F^*$ leads to
the rejection of $H_0$.

For computational purposes, the $SS_N$ term can be further simplified by expanding
the numerator in (4.20.3) into four terms and then rearranging them as
follows:


$$SS_N = \frac{\left[\sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij}\,y_{i.}\,y_{.j} - y_{..}\left(\sum_{i=1}^{a}\frac{y_{i.}^2}{b} + \sum_{j=1}^{b}\frac{y_{.j}^2}{a} - \frac{y_{..}^2}{ab}\right)\right]^2}{ab\,SS_A\,SS_B}. \tag{4.20.5}$$


The first term in the numerator of (4.20.5), that is, $\sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij}\,y_{i.}\,y_{.j}$, can


8 For a definition of the doubly noncentral F distribution, see Appendix H. 

9 Technically speaking the interaction hypothesis in (4.20.2) is incorrect. Tukey (1949b) considered
interactions of the form $(\alpha\beta)_{ij} = G\alpha_i\beta_j$ with $G$ fixed but unknown, leading to a
one-degree-of-freedom test of the interaction hypothesis $H_0 : G = 0$.




be more easily calculated by rewriting it as $\sum_{i=1}^{a} y_{i.}\left[\sum_{j=1}^{b} y_{ij}\,y_{.j}\right]$. The second
term within parentheses in the numerator of (4.20.5) is equivalent to
$SS_A + SS_B + y_{..}^2/ab$.
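
The rearrangement behind (4.20.5) can be sketched as follows. It is assumed here, as the notation of the formula suggests, that $y_{i.}$, $y_{.j}$, and $y_{..}$ denote the row, column, and grand totals, so that $\bar{y}_{i.} = y_{i.}/b$, $\bar{y}_{.j} = y_{.j}/a$, and $\bar{y}_{..} = y_{..}/ab$.

```latex
\begin{aligned}
\sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij}(\bar{y}_{i.}-\bar{y}_{..})(\bar{y}_{.j}-\bar{y}_{..})
  &= \frac{1}{ab}\sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij}
     \Bigl(y_{i.}-\frac{y_{..}}{a}\Bigr)\Bigl(y_{.j}-\frac{y_{..}}{b}\Bigr) \\
  &= \frac{1}{ab}\Bigl[\sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij}\,y_{i.}\,y_{.j}
     - y_{..}\Bigl(\sum_{i=1}^{a}\frac{y_{i.}^{2}}{b}
     + \sum_{j=1}^{b}\frac{y_{.j}^{2}}{a} - \frac{y_{..}^{2}}{ab}\Bigr)\Bigr], \\[4pt]
\sum_{i=1}^{a}(\bar{y}_{i.}-\bar{y}_{..})^{2}\,\sum_{j=1}^{b}(\bar{y}_{.j}-\bar{y}_{..})^{2}
  &= \frac{SS_A}{b}\cdot\frac{SS_B}{a}.
\end{aligned}
```

Squaring the first expression and dividing by the second gives the factor $ab\,SS_A\,SS_B$ in the denominator of (4.20.5). Since $\sum_i y_{i.}^2/b = SS_A + y_{..}^2/ab$ and $\sum_j y_{.j}^2/a = SS_B + y_{..}^2/ab$, the term in parentheses reduces to $SS_A + SS_B + y_{..}^2/ab$.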

The test is commonly known as Tukey’s one degree of freedom test for 
nonadditivity. It is discussed in detail by Scheffé (1959, pp. 129-134) and Rao 
(1973, pp. 249-255) and a numerical example appears in Ostle and Mensing 
(1975, Section 11.3). 


Example 1. We illustrate Tukey’s test for the data of Table 3.3. To calculate
the test statistic (4.20.4), we obtain


$$\begin{aligned}
\sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij}(\bar{y}_{i.} - \bar{y}_{..})(\bar{y}_{.j} - \bar{y}_{..})
 &= 241(261.67 - 239.5)(226.25 - 239.5) \\
 &\quad + \cdots + 227(232.33 - 239.5)(237.25 - 239.5) \\
 &= -2{,}332.1405,
\end{aligned}$$

$$\sum_{i=1}^{a}(\bar{y}_{i.} - \bar{y}_{..})^2 = \frac{SS_A}{b} = \frac{3{,}141.667}{3} = 1{,}047.2223,$$

$$\sum_{j=1}^{b}(\bar{y}_{.j} - \bar{y}_{..})^2 = \frac{SS_B}{a} = \frac{1{,}683.500}{4} = 420.8750.$$

Hence, from (4.20.3), we have

$$SS_N = \frac{(-2{,}332.1405)^2}{(1{,}047.2223)(420.8750)} = 12.340.$$

Finally, we obtain the test statistic (4.20.4) as

$$F^* = \frac{12.340/1}{(SS_{AB} - 12.340)/(6-1)} = 0.05.$$

Assuming the level of significance at $\alpha = 0.05$, we obtain $F[1, 5; 0.95] =
6.61$. Since $F^* < 6.61$, we may conclude that material and position do not
interact ($p = 0.832$). The use of the no-interaction model for the data in
Table 3.3, therefore, seems to be reasonable.
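
The calculations of (4.20.3) through (4.20.5) can be organized as in the following sketch. It is not the authors' program: the function names are illustrative, and the 4 x 3 data table is hypothetical (it is not the data of Table 3.3). The first function follows the definition (4.20.3) and the statistic (4.20.4); the second evaluates the computational form (4.20.5) from totals, so the two can be compared.

```python
def tukey_nonadditivity(y):
    """Tukey's one-degree-of-freedom test for a table y[i][j] with one observation per cell.

    Returns (SS_N, SS_AB, F_star, df2), where df2 = (a-1)(b-1)-1.
    """
    a, b = len(y), len(y[0])
    grand = sum(sum(row) for row in y) / (a * b)
    row_mean = [sum(row) / b for row in y]
    col_mean = [sum(y[i][j] for i in range(a)) / a for j in range(b)]
    # Numerator and denominator of (4.20.3).
    cross = sum(
        y[i][j] * (row_mean[i] - grand) * (col_mean[j] - grand)
        for i in range(a) for j in range(b)
    )
    sum_sq_a = sum((m - grand) ** 2 for m in row_mean)
    sum_sq_b = sum((m - grand) ** 2 for m in col_mean)
    ss_n = cross ** 2 / (sum_sq_a * sum_sq_b)
    # Interaction (residual) sum of squares of the additive model.
    ss_ab = sum(
        (y[i][j] - row_mean[i] - col_mean[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    df2 = (a - 1) * (b - 1) - 1
    f_star = (ss_n / 1) / ((ss_ab - ss_n) / df2)   # statistic (4.20.4)
    return ss_n, ss_ab, f_star, df2

def ss_n_computational(y):
    """SS_N from row, column and grand totals, following (4.20.5)."""
    a, b = len(y), len(y[0])
    row_tot = [sum(row) for row in y]
    col_tot = [sum(y[i][j] for i in range(a)) for j in range(b)]
    tot = sum(row_tot)
    correction = tot ** 2 / (a * b)
    ss_a = sum(t * t for t in row_tot) / b - correction
    ss_b = sum(t * t for t in col_tot) / a - correction
    first = sum(
        row_tot[i] * sum(y[i][j] * col_tot[j] for j in range(b))
        for i in range(a)
    )
    numerator = first - tot * (ss_a + ss_b + correction)
    return numerator ** 2 / (a * b * ss_a * ss_b)

# Hypothetical data (a = 4 rows, b = 3 columns), for illustration only.
y = [[12.0, 15.0, 19.0],
     [14.0, 18.0, 25.0],
     [11.0, 13.0, 16.0],
     [16.0, 22.0, 31.0]]

ss_n, ss_ab, f_star, df2 = tukey_nonadditivity(y)
print(f"SS_N = {ss_n:.4f}, SS_AB = {ss_ab:.4f}, F* = {f_star:.2f} on (1, {df2}) d.f.")
print(f"SS_N from (4.20.5) = {ss_n_computational(y):.4f}")
```

The value of $F^*$ is then compared with the tabulated $F[1, (a-1)(b-1)-1; 1-\alpha]$; the sketch leaves that comparison to a table lookup rather than computing a p-value.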


Remarks: (i) The power function of Tukey’s test for nonadditivity has been studied by 
Ghosh and Sharma (1963), and Hegemann and Johnson (1976). Milliken and Graybill 
(1971) developed tests for interaction in the two-way model with missing data. In addi- 
tion, a variety of tests that are sensitive to particular non-additivity structures have also 
been proposed in the literature (see, e.g., Mandel (1971), Hirotsu (1983), Miyakawa 
(1993)). Krishnaiah and Yochmowitz (1980), Johnson and Graybill (1972a), and Boik
(1993) provide reviews of additivity tests. For a generalization of Tukey’s test for a
two-way classification to any general analysis of variance or experimental design model, see
Milliken and Graybill (1970). 

(ii) If Tukey’s test shows the presence of interaction effects, some simple transformations
such as a square-root or logarithmic transformation may be employed to see if the
interaction can be removed or made negligible. Johnson and Graybill (1972b) discuss 
an approximate method of analysis in the presence of interaction effects. 

(iii) Tukey’s test of nonadditivity can be performed using the SAS GLM procedure by
first fitting a two-way model with factors A and B as sources. The predicted values
from the fitted model are squared and then the procedure is run again with a MODEL
statement that includes A, B, and the squared predictions as sources. The squared predictions
do not appear in the CLASS statement and appear as the last term of the MODEL statement
(as a covariate). The following SAS code illustrates the procedure:


    data tukey;
    input a b y;
    datalines;
    ...
    ;
    proc glm;
    class a b;
    model y = a b;
    output out = f p = pred;

    data ft;
    set f;
    p2 = pred * pred;
    proc glm;
    class a b;
    model y = a b p2;
    run;
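
That the squared-prediction covariate reproduces $SS_N$ can be checked numerically. The sketch below is an illustration in Python rather than SAS and rests on stated assumptions: the data table is hypothetical, the helper names are invented for the example, and the sum of squares for the covariate entered after A and B is computed by regressing the additive-model residuals on the part of the squared predictions not explained by A and B.

```python
def double_center(t):
    """Remove row and column means from a table: t_ij - t_i. - t_.j + t_.. (means)."""
    a, b = len(t), len(t[0])
    grand = sum(sum(row) for row in t) / (a * b)
    row_mean = [sum(row) / b for row in t]
    col_mean = [sum(t[i][j] for i in range(a)) / a for j in range(b)]
    return [[t[i][j] - row_mean[i] - col_mean[j] + grand for j in range(b)]
            for i in range(a)]

def ss_squared_prediction(y):
    """Sum of squares for the squared-prediction covariate, adjusted for A and B."""
    a, b = len(y), len(y[0])
    resid = double_center(y)                       # residuals of the additive fit
    pred = [[y[i][j] - resid[i][j] for j in range(b)] for i in range(a)]
    p2 = [[v * v for v in row] for row in pred]    # squared predictions
    z = double_center(p2)                          # covariate adjusted for A and B
    sxy = sum(z[i][j] * resid[i][j] for i in range(a) for j in range(b))
    sxx = sum(z[i][j] ** 2 for i in range(a) for j in range(b))
    return sxy ** 2 / sxx

def ss_n_direct(y):
    """SS_N from the definition (4.20.3)."""
    a, b = len(y), len(y[0])
    grand = sum(sum(row) for row in y) / (a * b)
    row_dev = [sum(row) / b - grand for row in y]
    col_dev = [sum(y[i][j] for i in range(a)) / a - grand for j in range(b)]
    cross = sum(y[i][j] * row_dev[i] * col_dev[j] for i in range(a) for j in range(b))
    return cross ** 2 / (sum(d * d for d in row_dev) * sum(d * d for d in col_dev))

# Hypothetical data, one observation per cell (illustration only).
y = [[12.0, 15.0, 19.0],
     [14.0, 18.0, 25.0],
     [11.0, 13.0, 16.0],
     [16.0, 22.0, 31.0]]

print(f"covariate SS = {ss_squared_prediction(y):.6f}")
print(f"SS_N         = {ss_n_direct(y):.6f}")
```

The agreement follows because the squared prediction, once adjusted for the row and column effects, equals twice the product of the estimated row and column deviations, which is the regressor implicit in (4.20.3).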


Under Model II, it is possible to test for the main effects (i.e., $\sigma_\alpha^2 = 0$
or $\sigma_\beta^2 = 0$) by dividing their respective mean squares by the interaction mean
square.¹⁰ However, we have lost our F test for the hypothesis $\sigma_{\alpha\beta}^2 = 0$, although
Tukey’s one degree of freedom test for nonadditivity described in the foregoing
can also be used here. In regard to the point estimation of variance components,
the estimators (3.7.17) and (3.7.18) are still unbiased for $\sigma_\alpha^2$ and $\sigma_\beta^2$, but our
estimate (3.7.16) of the error variance is biased since $E(MS_E) = \sigma_e^2 + \sigma_{\alpha\beta}^2$.
There is not much one can do about $MS_E$ being a biased estimator of $\sigma_e^2$
except to assume that $\sigma_{\alpha\beta}^2 = 0$; if the assumption were incorrect, one would be
erring on the conservative side because $MS_E$ would tend to overestimate $\sigma_e^2$.

Similar remarks apply for Model III.


4.21 ALTERNATE MIXED MODELS 


As remarked in Section 4.2, several different types of mixed models have been 
proposed in the statistical literature. Among other models proposed are those 
by Tukey (1949a), Wilk and Kempthorne (1955, 1956), Scheffé (1956b), and 
Smith and Murray (1984).¹¹ These models differ from the “standard” mixed
model, discussed earlier in this chapter, in terms of the assumptions about the
random effects $\beta_j$'s and $(\alpha\beta)_{ij}$'s. In this section, we briefly describe one of these


10 For a discussion of the robustness of the tests for $\sigma_\alpha^2$ and $\sigma_\beta^2$, see Tan (1981).
11 Smith and Murray (1984) proposed a model that employs covariance components to allow
negative correlations among observations within the same cell. 




TABLE 4.18
Analysis of Variance for an Alternate Mixed Model

Source of      Degrees of      Sum of       Mean         Expected
Variation      Freedom         Squares      Square       Mean Square

Due to A       $a-1$           $SS_A$       $MS_A$       $\sigma_e^2 + n\sigma_{\alpha\beta}^2 + \frac{bn}{a-1}\sum_{i=1}^{a}\alpha_i^2$
Due to B       $b-1$           $SS_B$       $MS_B$       $\sigma_e^2 + n\sigma_{\alpha\beta}^2 + an\sigma_\beta^2$
Interaction    $(a-1)(b-1)$    $SS_{AB}$    $MS_{AB}$    $\sigma_e^2 + n\sigma_{\alpha\beta}^2$
A x B
Error          $ab(n-1)$       $SS_E$       $MS_E$       $\sigma_e^2$


alternate models. Suppose that the $\alpha_i$'s are fixed effects such that $\sum_{i=1}^{a}\alpha_i = 0$
and $\beta_j$'s are independently distributed normal random variables with mean
zero and variance $\sigma_\beta^2$. The interaction effects $(\alpha\beta)_{ij}$'s are also independently
distributed normal random variables with mean zero and variance $\sigma_{\alpha\beta}^2$ and
$(\alpha\beta)_{ij}$'s are independent of the $\beta_j$'s. Note that the main difference between
this mixed model and the “standard” mixed model discussed earlier is the
assumption about the independence of the interaction effects.¹² The analysis of
variance and expected mean squares for this model are shown in Table 4.18. On
comparing this table to Table 4.2, we note that the only noticeable difference is
the inclusion of the variance component $\sigma_{\alpha\beta}^2$ in the expected mean square for the
random effects which does not appear in the “standard” model. (In fact, there
are some other minor differences due to different definitions of the variance of
the interaction effects in the two models, but these do not affect the analysis.)
Under this model, the hypothesis


$$H_0^B : \sigma_\beta^2 = 0 \quad \text{versus} \quad H_1^B : \sigma_\beta^2 > 0$$

would be tested by the statistic


$$F_B = \frac{MS_B}{MS_{AB}},$$


in contrast with the statistic $F_B = MS_B/MS_E$, used in the “standard” model.
The test is usually more conservative than the one based on the “standard”
model, since $MS_{AB}$ will in general be larger than $MS_E$. Again, the analysis of
variance procedure may be used to estimate the variance components. From the
mean square column of Table 4.18, we find that the only variance component
which would have different estimates from the ones obtained in the “standard”


12 The major criticism of the mixed model described here concerns the assumption of independence
of the $(\alpha\beta)_{ij}$'s since it is felt that these random terms within a given level of the factor B will
often be correlated.




model is $\sigma_\beta^2$, which is now estimated by


$$\hat{\sigma}_\beta^2 = \frac{MS_B - MS_{AB}}{an}.$$
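
As a numerical illustration of how the two mixed models differ for the random factor, the sketch below uses the mean squares shown in the output of Figure 4.4, taking AGING as the fixed factor A ($a = 2$), MIX as the random factor B ($b = 3$), and $n = 3$ observations per cell. The variable names are illustrative, and the comparison itself is an added illustration rather than part of the original example.

```python
# Mean squares read from the output in Figure 4.4.
a, b, n = 2, 3, 3      # levels of A (AGING, fixed), levels of B (MIX, random), cell size
ms_b = 1478.722        # mean square for MIX
ms_ab = 563.167        # mean square for AGING*MIX
ms_e = 532.222         # error mean square

# "Standard" mixed model: MS_B is compared with MS_E.
f_b_standard = ms_b / ms_e
var_b_standard = (ms_b - ms_e) / (a * n)

# Alternate mixed model of this section: MS_B is compared with MS_AB.
f_b_alternate = ms_b / ms_ab
var_b_alternate = (ms_b - ms_ab) / (a * n)

print(f"standard model:  F_B = {f_b_standard:.3f}, estimate of sigma_beta^2 = {var_b_standard:.4f}")
print(f"alternate model: F_B = {f_b_alternate:.3f}, estimate of sigma_beta^2 = {var_b_alternate:.4f}")
```

The first pair of values agrees with the F ratio 2.78 and the variance component estimate 157.75 reported in the SAS and BMDP panels of Figure 4.4, while the ratio under the alternate model agrees with the value 2.626 in the SPSS panel, which tests MIX against the interaction mean square.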
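The hypothesis concerning the fixed effects factor is

$$H_0^A : \alpha_i = 0 \text{ for all } i \quad \text{versus} \quad H_1^A : \alpha_i \neq 0 \text{ for at least one } i.$$

An exact $\alpha$-level test of $H_0^A$ is to reject $H_0^A$ if $MS_A/MS_{AB} > F[a-1, (a-1)(b-1); 1-\alpha]$.
Notice that this is the same test as the one under the “standard”
model discussed in Section 4.6. The fixed effects $\mu$, $\mu + \alpha_i$, $\alpha_i$, and $\alpha_i - \alpha_{i'}$ are,
of course, estimated by

$$\hat{\mu} = \bar{y}_{...}, \qquad \widehat{\mu + \alpha_i} = \bar{y}_{i..}, \qquad \hat{\alpha}_i = \bar{y}_{i..} - \bar{y}_{...},$$

and

$$\hat{\alpha}_i - \hat{\alpha}_{i'} = \bar{y}_{i..} - \bar{y}_{i'..}, \quad i \neq i',$$

which are the same estimates as in the “standard” model. However, the variances
of the estimates will differ from those in the “standard” model. Thus,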


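$$\mathrm{Var}(\bar{y}_{...}) = \frac{\sigma_e^2 + n\sigma_{\alpha\beta}^2 + an\sigma_\beta^2}{abn},$$

$$\mathrm{Var}(\bar{y}_{i..}) = \frac{\sigma_e^2 + n\sigma_{\alpha\beta}^2 + n\sigma_\beta^2}{bn},$$

$$\mathrm{Var}(\bar{y}_{ij.}) = \frac{\sigma_e^2 + n\sigma_{\alpha\beta}^2 + n\sigma_\beta^2}{n},$$

and

$$\mathrm{Var}(\bar{y}_{i..} - \bar{y}_{i'..}) = \frac{2(\sigma_e^2 + n\sigma_{\alpha\beta}^2)}{bn}.$$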
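
These variances follow directly from the independence assumptions of the alternate model. As a brief sketch, assuming the model is written as $y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + e_{ijk}$ with the $\beta_j$'s, $(\alpha\beta)_{ij}$'s, and $e_{ijk}$'s mutually independent:

```latex
\begin{aligned}
\bar{y}_{i..} &= \mu + \alpha_i + \frac{1}{b}\sum_{j=1}^{b}\beta_j
                + \frac{1}{b}\sum_{j=1}^{b}(\alpha\beta)_{ij}
                + \frac{1}{bn}\sum_{j=1}^{b}\sum_{k=1}^{n} e_{ijk}, \\
\mathrm{Var}(\bar{y}_{i..}) &= \frac{\sigma_\beta^2}{b} + \frac{\sigma_{\alpha\beta}^2}{b}
                + \frac{\sigma_e^2}{bn}
                = \frac{\sigma_e^2 + n\sigma_{\alpha\beta}^2 + n\sigma_\beta^2}{bn}, \\
\bar{y}_{i..} - \bar{y}_{i'..} &= \alpha_i - \alpha_{i'}
                + \frac{1}{b}\sum_{j=1}^{b}\bigl[(\alpha\beta)_{ij} - (\alpha\beta)_{i'j}\bigr]
                + \frac{1}{bn}\sum_{j=1}^{b}\sum_{k=1}^{n}\bigl(e_{ijk} - e_{i'jk}\bigr), \\
\mathrm{Var}(\bar{y}_{i..} - \bar{y}_{i'..}) &= \frac{2\sigma_{\alpha\beta}^2}{b} + \frac{2\sigma_e^2}{bn}
                = \frac{2(\sigma_e^2 + n\sigma_{\alpha\beta}^2)}{bn}.
\end{aligned}
```

The $\beta_j$ terms cancel in the difference of two level means, which is why $\sigma_\beta^2$ does not appear in the last variance.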


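An unbiased estimate of $\mathrm{Var}(\bar{y}_{i..})$ can be obtained by using an appropriate linear
combination of the mean squares. An approximate $1 - \alpha$ level confidence
interval for $\mu + \alpha_i$ can be constructed using


$$\bar{y}_{i..} \pm t[\nu, 1 - \alpha/2]\sqrt{\frac{\hat{\sigma}_e^2 + n\hat{\sigma}_{\alpha\beta}^2 + n\hat{\sigma}_\beta^2}{bn}},$$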




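where the degrees of freedom $\nu$ will be estimated using the Satterthwaite procedure.
Similarly, an exact confidence interval for $\alpha_i - \alpha_{i'}$ is given by


$$\bar{y}_{i..} - \bar{y}_{i'..} \pm t[(a-1)(b-1), 1 - \alpha/2]\sqrt{\frac{2MS_{AB}}{bn}},$$


and an exact confidence interval for a general contrast of the form $\sum_{i=1}^{a}\ell_i\alpha_i$
($\sum_{i=1}^{a}\ell_i = 0$) is

$$\sum_{i=1}^{a}\ell_i\bar{y}_{i..} \pm t[(a-1)(b-1), 1 - \alpha/2]\sqrt{\frac{MS_{AB}\sum_{i=1}^{a}\ell_i^2}{bn}},$$

which is exactly the same as that given by (4.8.9).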
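
The linear combination of mean squares and the Satterthwaite degrees of freedom needed for the interval on $\mu + \alpha_i$ can be sketched as follows. Under the expected mean squares of Table 4.18, $[MS_B + (a-1)MS_{AB}]/a$ is unbiased for $\sigma_e^2 + n\sigma_{\alpha\beta}^2 + n\sigma_\beta^2$. The mean squares below are taken from Figure 4.4 purely for illustration (AGING fixed with $a = 2$, MIX random with $b = 3$, $n = 3$); the variable names are assumptions of this sketch, and the percentile of the t distribution is left to a table lookup.

```python
import math

a, b, n = 2, 3, 3
ms_b, ms_ab = 1478.722, 563.167   # MIX and AGING*MIX mean squares from Figure 4.4
df_b = b - 1
df_ab = (a - 1) * (b - 1)

# Unbiased estimate of sigma_e^2 + n*sigma_ab^2 + n*sigma_b^2, then of Var(ybar_i..).
combination = (ms_b + (a - 1) * ms_ab) / a
var_ybar_i = combination / (b * n)
std_error = math.sqrt(var_ybar_i)

# Satterthwaite approximation for the degrees of freedom of the combination:
# (sum of c_k * MS_k)^2 / sum of (c_k * MS_k)^2 / f_k; the common factor 1/a cancels.
numerator = (ms_b + (a - 1) * ms_ab) ** 2
denominator = ms_b ** 2 / df_b + ((a - 1) * ms_ab) ** 2 / df_ab
nu = numerator / denominator

print(f"estimated Var(ybar_i..) = {var_ybar_i:.4f}, standard error = {std_error:.4f}")
print(f"Satterthwaite degrees of freedom = {nu:.3f}")
```

The interval is then $\bar{y}_{i..}$ plus or minus the tabulated $t[\nu, 1-\alpha/2]$ times the standard error, with $\nu$ usually rounded down to be conservative.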

The “standard” model as well as the new mixed model described here are special
cases of the mixed model discussed by Scheffé (1956b; 1959, pp. 261-274).
According to Scheffé’s model, the observation $y_{ijk}$ is represented
by


$$y_{ijk} = m_{ij} + e_{ijk}, \qquad i = 1, 2, \ldots, a;\ j = 1, 2, \ldots, b;\ k = 1, 2, \ldots, n,$$


where $m_{ij}$ and $e_{ijk}$ are mutually and completely independent random variables.
Furthermore, $m_{ij}$ is given by the linear structure:


$$m_{ij} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij},$$


where

$$E(m_{ij}) = \mu + \alpha_i,$$

with

$$\sum_{i=1}^{a}\alpha_i = 0, \qquad \beta_j = \frac{1}{a}\sum_{i=1}^{a} m_{ij} - \mu,$$

and

$$\sum_{i=1}^{a}(\alpha\beta)_{ij} = 0, \qquad j = 1, 2, \ldots, b.$$


Thus, for Scheffé’s model the restrictions on $(\alpha\beta)_{ij}$'s are the same as in
the “standard” mixed model discussed in Section 4.2. The main difference between
them arises in specifying the variance-covariance structure of the random
components $\beta_j$ and $(\alpha\beta)_{ij}$.
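
The restriction on the interaction terms is a consequence of the definitions rather than an additional assumption. As a short sketch, writing $\bar{m}_{.j} = \frac{1}{a}\sum_{i=1}^{a} m_{ij}$ (a symbol introduced here only for convenience):

```latex
\begin{aligned}
\beta_j &= \bar{m}_{.j} - \mu, \\
(\alpha\beta)_{ij} &= m_{ij} - \mu - \alpha_i - \beta_j = m_{ij} - \bar{m}_{.j} - \alpha_i, \\
\sum_{i=1}^{a}(\alpha\beta)_{ij} &= \sum_{i=1}^{a} m_{ij} - a\,\bar{m}_{.j} - \sum_{i=1}^{a}\alpha_i
  = a\,\bar{m}_{.j} - a\,\bar{m}_{.j} - 0 = 0, \qquad j = 1, 2, \ldots, b.
\end{aligned}
```

Because $\beta_j$ and the $(\alpha\beta)_{ij}$'s are all functions of the same column of $m_{ij}$'s, they cannot in general be taken as independent of one another.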


Remark: Scheffé assumes that the vectors $(\beta_j, (\alpha\beta)_{1j}, (\alpha\beta)_{2j}, \ldots, (\alpha\beta)_{aj})$, $j = 1, \ldots, b$,
are independent multivariate normal vectors that satisfy the constraint $\sum_{i=1}^{a}(\alpha\beta)_{ij} = 0$
for each $j$. This implies that $(\alpha\beta)_{1j}, (\alpha\beta)_{2j}, \ldots, (\alpha\beta)_{aj}$ are dependent on $\beta_j$. He further
defines a covariance structure for the $\{m_{ij}\}$ and it is possible to express the variances and
covariances of the $\beta_j$'s and $(\alpha\beta)_{ij}$'s indirectly by stating the elements of this
variance-covariance matrix. Thus, the two mixed models discussed in this chapter are rather
special cases of Scheffé’s model. The analysis of Scheffé’s model is similar to the
“standard” model and the F tests for testing hypotheses $H_0^B : \sigma_\beta^2 = 0$ and $H_0^{AB} : \sigma_{\alpha\beta}^2 = 0$
are exactly the same as for the “standard” model. However, the distribution theory of
$MS_A$ and $MS_{AB}$ is much more complicated; and, in general, the statistic $MS_A/MS_{AB}$ is
not always distributed as an F variable when $H_0^A : \alpha_i = 0$ is true. The only way to obtain
an exact test for this problem is to consider it in a multivariate framework, which leads
to Hotelling’s $T^2$ test (Scheffé, 1959, pp. 270-274). Scheffé avoids this procedure and
instead suggests the use of the ratio $MS_A/MS_{AB}$ which can be approximated as an F
variable with $a - 1$ and $(a-1)(b-1)$ degrees of freedom. Another difference is that
even though the hypothesis $H_0^{AB} : \sigma_{\alpha\beta}^2 = 0$ may be tested by the statistic $MS_{AB}/MS_E$,
which under $H_0^{AB}$ has an F distribution with $(a-1)(b-1)$ and $ab(n-1)$ degrees of
freedom, the power is not expressible in terms of the central or noncentral F distribution,
since $SS_{AB}$ is not distributed as constant times a chi-square variable when $H_0^{AB}$
is false. The power of this test has been studied by Imhof (1958). For a discussion of
multiple comparison methods for Scheffé’s mixed model, see Hochberg and Tamhane
(1983).


In view of numerous versions of mixed models, a natural question arises as 
to which model should be employed. Most people tend to favor the “standard 
model” and it is most often discussed in the literature. Furthermore, the results 
on expected mean squares under sampling from a finite population agree in form 
with those of the standard model (see Chapter IX). If the correlation values of
the random components are not high, then either mixed model can be used 
and there are only minor differences between them. However, if the correlation 
values tend to be large, then Scheffé’s model should be preferred. The choice 
between different mixed models should always be guided by the correlation 
structure of the observed data and to what extent the correlations between the 
random components affect the characteristics of tests and estimation procedures 
of different mixed models. 


4.22 EFFECTS OF VIOLATIONS OF ASSUMPTIONS 
OF THE MODEL 


The list of assumptions for the model (4.1.1) is almost an exact parallel to the 
list of assumptions for models (2.1.1) and (3.1.1). Similar assumptions are made 
for more complex experiments entailing higher-order classifications. Thus, as 
one may anticipate, the same violations of assumptions are possible in the 
two- or multi-way crossed classifications as in the one-way classification. In 
this section, we briefly summarize some known results concerning the effects 
of violations of assumptions on the inference of the model (4.1.1). Further 
discussions on this topic can be found in Scheffé (1959, Chapter X) and Miller 
(1986, Chapter IV). 




MODEL I (FIXED EFFECTS) 


For experiments involving balanced or nearly balanced designs, with relatively 
large numbers of observations per cell, the assumption of normality for error 
terms seems to be rather unimportant. However, for severely unbalanced ex- 
periments, the heavy-tailed or contaminated distributions may produce outliers 
and thereby distort the results on estimates and tests of significance. Thus, in 
an experiment, if the observations are suspected to depart from normality, then 
perhaps a balanced design with a correspondingly large number of observa- 
tions per cell should be used. Furthermore, if the data yield an equal number of 
observations in each cell, then the requirement of equal error variance in each 
cell, if violated, may not involve any serious risk. 

Krutchkoff (1989) carried out a simulation study to compare the performance 
of the usual F test along with a new procedure called the K test. The results 
indicated that the F test had larger type I error and decreased power. In designs 
involving unequal numbers of observations per cell, “... the size of the F test 
was inflated when the larger errors were on the cells with the smaller number 
of observations and deflated when the larger errors were on the cells with the 
larger number of observations.” In both situations, there was a decrease in 
power; but the drop was much more serious for the latter case. However, the K 
test was generally insensitive to the heterogeneity of variances. Consequently, 
there are two good reasons for planning an experiment with an equal number 
of observations per cell: the experimental design will be balanced leading to 
simple exact tests and the possible consequences of heterogeneous variances 
will be minimized. 

The assumption of independence seems to have major importance and its 
violation may lead to erroneous conclusions. (For a discussion of the prob- 
lem of serial correlation created by observations taken in time sequence, see 
Section 3.17.) For this reason great care should be taken in the planning and 
analysis of experiments involving repeated observations to ensure the indepen- 
dence of error terms. Thus, random assignment of experimental units to the 
treatment combinations is especially important.


MODEL II (RANDOM EFFECTS) 


The lack of normality in any of the random effects can seriously affect the
distribution theory of the sums of squares involving them. The point estimates of
variance components are still unbiased but the effects of nonnormality on tests
and confidence intervals can lead to erroneous results. In particular, the tests
and confidence intervals on $\sigma_e^2$ are very sensitive to nonnormality. However,
the statistics $MS_A/MS_{AB}$ and $MS_B/MS_{AB}$, for testing the effects of variance
components, are somewhat robust.

Very little is known about the effects of unequal variances and the lack of 
independence on the inferences for the two-way random model. 




MODEL III (MIXED EFFECTS) 


There are very few studies dealing with the effects of violation of assumptions 
on the inferences for the two-way mixed model. For balanced or nearly balanced 
designs and moderate departures from normality, the effect of nonnormal random 
effects $\{e_{ijk}\}$, $\{(\alpha\beta)_{ij}\}$, $\{\beta_j\}$ on tests about the fixed effects $\{\alpha_i\}$ is expected 
to be rather small. In the presence of appreciable nonnormality, however, the 
consequences could be more serious. In regard to tests and confidence intervals 
for the variance components, the effects of nonnormality may be extremely 
misleading. 

The problem of unequal variances could occur in terms of the variances of 
any of the random effects $\{e_{ijk}\}$, $\{(\alpha\beta)_{ij}\}$, and $\{\beta_j\}$. For tests on $\{\alpha_i\}$, the effect 
of varying $\sigma_e^2$ and $\sigma_{\alpha\beta}^2$ should be somewhat similar to the two-way fixed effects 
model (3.1.1) (with one observation per cell) since $MS_{AB}$ is used in the denominator 
of the F test. The effect of varying $\sigma_e^2$ on testing the hypotheses concerning 
the variance components $\sigma_{\alpha\beta}^2$ and $\sigma_\beta^2$ should be similar to the one-way model 
(2.1.1) since $MS_E$ is used in the denominators of the associated F statistics. 

Not much is known concerning the effect of lack of independence of the 
random effects $\{e_{ijk}\}$, $\{(\alpha\beta)_{ij}\}$, and $\{\beta_j\}$ on inferences for the two-way mixed 
model. 


EXERCISES 


1. An industrial engineer wishes to determine whether four different 
makes of automobiles would yield the same mileage. An experiment 
is designed wherein a random sample of three cars of each make is 
selected from each of three cities, and each car is given a test run with 
one gallon of gasoline. The results on the number of miles traveled 
are given as follows. 


                      Make of Automobile 
City              I       II      III     IV 

Boston           24.4    23.6    27.1    22.6 
                 23.9    22.7    28.0    22.3 
                 25.5    22.9    27.4    23.6 
Los Angeles      25.7    24.2    25.1    24.5 
                 26.5    23.9    26.8    24.2 
                 25.4    24.6    24.8    25.3 
Dallas           23.9    23.7    27.3    24.4 
                 22.7    23.3    27.0    23.5 
                 25.1    24.8    26.6    24.1 


(a) Why was it considered necessary to include three cities in the 
experiment rather than just one city? 

(b) How would you obtain a random sample of three cars from a 
city? 




(c) What assumptions are made about the populations, and what 
hypotheses can be tested? 

(d) Describe the model and the assumptions for the experiment. 

(e) Analyze the data and report the analysis of variance table. 

(f) Test whether there are differences in mileage among the cities. 
Use $\alpha = 0.05$. 

(g) Test whether there are differences in mileage between makes of 
automobiles. Use $\alpha = 0.05$. 

(h) Test whether there are interaction effects between cities and 
makes of automobiles. Use $\alpha = 0.05$. 

2. An experiment is designed to compare the corrosion effect on three 
leading metal products. Eighteen samples, six of each metal, were 
used in the experiment and they were assigned at random into six 
groups of three each. The first three groups had densities (kg/mm?) 
taken after a test period of 30 hours and the next three groups were 
measured after a test period of 60 hours. The relevant data in certain 
standard units are given as follows. 


Metal Product 
Test Period (hrs) Steel Copper Zinc 


30 149 158 129 
126 129 124 
115 158 154 
60 152 112 126 
142 152 151 
124 117 138 


(a) Describe the model and the assumptions for the experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Test whether there are differences in corrosion effect among the 
three metal products. Use $\alpha = 0.05$. 

(d) Test whether there are differences in corrosion effect between 
the two test periods. Use $\alpha = 0.05$. 

(e) Test whether there are interaction effects between metal products 
and test periods. Use $\alpha = 0.05$. 

(f) Determine a 95 percent confidence interval for $\sigma_e^2$. 

(g) Let $\alpha_i$ be the effect of the i-th test period and $\beta_j$ be the effect 
of the j-th metal product. Determine simultaneous confidence 
intervals for $\alpha_1 - \alpha_2$ and $\beta_1 - \beta_3$ using an overall confidence 
level of 0.95. 

3. A tool manufacturer wishes to study the effect of tool temperature and 
tool speed on a certain type of milling machine. An experiment was 
designed wherein two levels of tool temperature (300°F and 500°F) 
and four levels of tool speed ($V_1$, $V_2$, $V_3$, and $V_4$) were used, and three 
measurements were made for each combination of tool temperature 




and tool speed. The relevant data on milling machine measurements 
in certain standard units are given as follows. 


Tool                          Tool Speed 
Temperature (°F)     V1       V2       V3       V4 

300                 4,783    5,720    5,185    5,530 
                    5,373    5,190    5,150    5,540 
                    5,383    5,523    5,397    5,155 
500                 5,225    5,837    5,131    5,493 
                    5,533    5,180    5,290    5,341 
                    5,145    5,190    5,235    5,390 


(a) Describe the model and the assumptions for the experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Test whether there are differences in milling machine measurements 
among the four tool speeds. Use $\alpha = 0.05$. 

(d) Test whether there are differences in milling machine measurements 
between the two tool temperatures. Use $\alpha = 0.05$. 

(e) Test whether there are interaction effects between tool temperatures 
and tool speeds. Use $\alpha = 0.05$. 

(f) Determine a 95 percent confidence interval for $\sigma_e^2$. 

(g) Let $\alpha_i$ be the effect of the i-th tool temperature and $\beta_j$ be the 
effect of the j-th tool speed. Determine simultaneous confidence 
intervals for $\alpha_1 - \alpha_2$ and $\beta_1 - \beta_3$ using an overall confidence 
level of 0.95. 

4. An experiment was performed to determine the “active life” for three 
specimens of punching dies $P_1$, $P_2$, and $P_3$ taken from seven punching 
machines $M_1, M_2, \ldots, M_7$ in a certain factory. The relevant data on 
measurements in minutes are given as follows. 


                           Punching Machine 
Punching 
Die         M1      M2      M3      M4      M5      M6      M7 

P1         35.8    37.7    36.8    36.9    39.3    37.4    41.3 
           38.2    40.2    40.9    35.9    37.5    38.8    43.3 
P2         38.3    33.9    37.8    35.7    35.9    36.5    37.1 
           36.1    37.2    38.5    33.1    37.3    38.3    36.4 
P3         38.7    39.7    38.9    37.3    36.9    38.1    40.0 
           35.9    40.6    35.6    35.7    35.4    35.6    35.9 


(a) Describe the model and the assumptions for the experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Test whether there are differences in “active life” among the 
seven punching machines. Use $\alpha = 0.05$. 




(d) Test whether there are differences in “active life” among the three 
punching dies. Use $\alpha = 0.05$. 

(e) Test whether there are interaction effects between punching dies 
and punching machines. Use $\alpha = 0.05$. 

(f) Determine a 95 percent confidence interval for $\sigma_e^2$. 

(g) Let $\alpha_i$ be the effect of the i-th punching die and $\beta_j$ be the effect of 
the j-th punching machine. Determine simultaneous confidence 
intervals for $\alpha_1 - \alpha_2$ and $\beta_1 - \beta_3$ using an overall confidence 
level of 0.95. 

5. A production engineer wishes to study the effect of cutting tempera- 
ture and cutting pressure on the surface finish of the machined com- 
ponent. He designs an experiment wherein three levels of each factor 
are selected, and a factorial experiment with two replicates is run. 
The relevant data in certain standard units are given as 
follows. 


Temperature 


Pressure Low Medium High 


Low 51.5 51.8 51.3 
51.3 51.7 51.5 
Medium 51.2 51.6 49.9 
51.4 51.7 51.2 
High 51.6 51.9 51.5 
51.8 51.8 51.2 


(a) Describe the model and the assumptions for the experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Test whether there are differences in the surface finish among 
the three levels of temperature. Use $\alpha = 0.05$. 

(d) Test whether there are differences in the surface finish among 
the three levels of pressure. Use $\alpha = 0.05$. 

(e) Test whether there are interaction effects between cutting 
temperatures and cutting pressures. Use $\alpha = 0.05$. 

(f) Determine a 95 percent confidence interval for $\sigma_e^2$. 

(g) Let $\alpha_i$ be the effect of the i-th pressure and $\beta_j$ be the effect of the 
j-th temperature. Determine simultaneous confidence intervals 
for $\alpha_1 - \alpha_2$ and $\beta_1 - \beta_3$ using an overall confidence level of 0.95. 

(h) Evaluate the power of the test for detecting a true difference in 
pressure such that $\sum_{i=1}^{3} \alpha_i^2 = 0.10$, where $\alpha_i$ is the i-th level 
pressure effect. 

6. The following table gives the partial results of the analysis of vari- 
ance computations performed on the data of the life of five brands 
of plastic products used under five different process temperatures. 
Three plastics of each brand were used for each process temperature. 
Complete the analysis of variance table and perform the relevant tests 




of hypotheses of interest to the experimenter. Why was it thought nec- 
essary to include different process temperatures in the experiment? 
Explain. 


Source of Variation Sum of Squares 


Brand 311.23 
Temperature 321.34 
Interaction Lee 
Error 23.31 
Total 915.7 


7. It is suspected that the strength of a tensile specimen is affected by the 
strain rate and the temperature. A factorial experiment is designed 
wherein four temperatures are randomly selected for each of three 
strain rates. The relevant data in certain standard units are given as 
follows. 


                      Temperature (°F) 
Strain Rate 
(s⁻¹)           100     200     300     400 

0.10             81      86      89     106 
                 91      75      95     111 
                 67      79      99     103 
0.20            109     105     106     111 
                 93     111     115     107 
                 95      95     102     106 
0.30            106     111     115     111 
                105     106     117     118 
                109     102     106     114 


(a) Describe the model and the assumptions for the experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Test whether there are differences in the tensile strength among 
the levels of temperature. Use $\alpha = 0.05$. 

(d) Test whether there are differences in the tensile strength between 
strain rates. Use $\alpha = 0.05$. 

(e) Test whether there are interaction effects between levels of 
temperature and strain rates. Use $\alpha = 0.05$. 

(f) Determine point and interval estimates of the variance compo- 
nents of the model. 

(g) Determine a 95 percent confidence interval for the mean differ- 
ence in response for strain rates of 0.10 and 0.30. 

(h) Analyze the data using the alternate mixed model discussed in 
Section 4.21 and compare the results obtained from the two mod- 
els. 




8. A production control engineer wishes to study the factors that influ- 
ence the breaking strength of metallic sheets. He designs an experi- 
ment wherein four machines and three robots are selected at random 
and a factorial experiment is performed using metallic sheets from 
the same production batch. The relevant data on breaking strength in 
certain standard units are given as follows. 


                    Machine 
Robot       1       2       3       4 

1          112     113     111     113 
           113     118     112     [illegible] 
2          113     113     114     118 
           115     114     112     117 
3          119     115     117     123 
           117     118     122     119 


(a) Describe the model and the assumptions for the experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Test whether there are differences in the breaking strength of the 
metallic sheets among machines. Use $\alpha = 0.05$. 

(d) Test whether there are differences in the breaking strength of the 
metallic sheets between robots. Use $\alpha = 0.05$. 

(e) Test whether there are interaction effects between machines and 
robots. Use $\alpha = 0.05$. 

(f) Determine point and interval estimates of the variance 
components of the model. 

(g) Determine the power of the test for detecting a machine effect 
such that $\sigma_\beta^2 = \sigma_e^2$, where $\sigma_\beta^2$ is the variance component for the 
machine factor and $\sigma_e^2$ is the error variance component. 

(h) Suppose that the robots were selected at random, but only four 
machines were available for the test. Test for the main effects and 
interaction at the 5 percent level of significance. Does the new 
experimental situation affect either the analysis or the conclu- 
sions of your study? 

9. A production control engineer wishes to study the thrust force gen- 
erated by a lathe. He suspects that the cutting speed and the depth 
of cut of the material are the most important determining factors. 
He designs an experiment wherein four depths of cut are randomly 
selected and a high and low cutting speed chosen to represent the 
extreme operating conditions. The relevant data in certain standard 
units are given as follows. 


                         Depth of Cut 
Cutting Speed      0.01      0.03      0.05 

Low                2.61      2.36      2.66 
                   2.69      2.39      2.77 
High               2.74      2.77      2.85 
                   2.77      2.78      2.79 


(a) Describe the model and the assumptions for the experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Test whether there are differences in thrust force among depths 
of cut. Use $\alpha = 0.05$. 

(d) Test whether there are differences in thrust force between cutting 
speeds. Use $\alpha = 0.05$. 

(e) Test whether there are interaction effects between cutting speeds 
and depths of cut. Use $\alpha = 0.05$. 

(f) Estimate the variance components of the model (point and interval 
estimates). 

10. A quality control engineer wishes to study the influence of furnace 
temperature and type of material on the quality of a cast product. 
An experiment was designed to include three levels of furnace 
temperature (1200°F, 1250°F, and 1300°F) for each of three types of 
material. The relevant data in certain standard units are given as 
follows. 


                       Temperature 
Material     1200°F     1250°F     1300°F 

1              791       2191       2493 
               779       2198       2491 
               781       2196       2497 
2              761       2181       2439 
               741       2145       2423 
               789       2111       2399 
3              757       2156        978 
               786       2164       1115 
               799       2177        999 


(a) State the model and the assumptions for the experiment. Assume 
that both factors are fixed. 

(b) Analyze the data and report the analysis of variance table. 

(c) Does the material type affect the response? Use $\alpha = 0.05$. 

(d) Does the temperature affect the response? Use $\alpha = 0.05$. 

(e) Is there a significant interaction effect? Use $\alpha = 0.05$. 




11. Crump (1946) reported the results of analysis of variance performed 
on the data from four successive genetic experiments on egg pro- 
duction with the same sample of 25 races of the common fruitfly 
(Drosophila melanogaster), 12 females being sampled from each 
race for each experiment. The observations were the total number of 
eggs produced by a female on the fourth day of laying. The mathe- 
matical model for this experiment would be 


$$
y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + e_{ijk}
\qquad
\begin{cases}
i = 1, 2, 3, 4 \\
j = 1, 2, \ldots, 25 \\
k = 1, 2, \ldots, 12,
\end{cases}
$$

where $\mu$ is the general mean, $\alpha_i$ is the effect of the i-th experiment, 
$\beta_j$ is the effect of the j-th race, $(\alpha\beta)_{ij}$ is the interaction of 
the i-th experiment with the j-th race, and the $e_{ijk}$'s are experimental 
errors. It is further assumed that $\alpha_i \sim N(0, \sigma_\alpha^2)$, $\beta_j \sim N(0, \sigma_\beta^2)$, 
$(\alpha\beta)_{ij} \sim N(0, \sigma_{\alpha\beta}^2)$; and that the $\alpha_i$'s, $\beta_j$'s, $(\alpha\beta)_{ij}$'s, and $e_{ijk}$'s are 
mutually and completely independent. The analysis of variance 
computations of the data (not reported here) are carried out exactly as 
in Section 4.15 and the results on sums of squares are given as 
follows. 


Analysis of Variance for Genetic Experiments Data 


Source of      Degrees of    Sum of      Mean      Expected 
Variation      Freedom       Squares     Square    Mean Square    F Value    p-Value 

Experiment                   139,977 
Race                          77,832 
Interaction                   33,048 
Error                        254,100 
Total                        504,957 


Source: Crump (1946). Used with permission. 


(a) Complete the remaining columns of the preceding analysis of 
variance table. 

(b) Test whether there are differences in egg production among 
different experiments. Use $\alpha = 0.05$. 

(c) Test whether there are differences in egg production among 
different races. Use $\alpha = 0.05$. 




(d) Test whether there are interaction effects between experiments 
and races. Use $\alpha = 0.05$. 

(e) Determine point and interval estimates for each of the variance 
components of the model. 

(f) Suppose that the 25 races used in the experiment are of particular 
interest to the experimenter; and thus, this factor is considered 
to have a fixed effect. Perform tests of hypotheses and obtain 
estimates of the variance components under the assumptions of 
the mixed model. 

12. Box and Cox (1964) reported data from an experiment designed to 
investigate the effects of certain toxic agents. Groups of four animals 
were randomly allocated to three poisons and four treatments using 
a 3 × 4 replicated factorial design. The survival times (unit, 10 hrs) of 
animals were recorded and the data are given as follows. 


                    Treatment 
Poison       A       B       C       D 

I           0.31    0.82    0.43    0.45 
            0.45    1.10    0.45    0.71 
            0.46    0.88    0.63    0.66 
            0.43    0.72    0.76    0.62 
II          0.36    0.92    0.44    0.56 
            0.29    0.61    0.35    1.02 
            0.40    0.49    0.31    0.71 
            0.23    1.24    0.40    0.38 
III         0.22    0.30    0.23    0.30 
            0.21    0.37    0.25    0.36 
            0.18    0.38    0.24    0.31 
            0.23    0.29    0.22    0.33 


Source: Box and Cox (1964). Used with per- 
mission. 


(a) State the model and the assumptions for the experiment. Assume 
that both poison and treatment factors are fixed. 

(b) Analyze the data and report the analysis of variance table. 

(c) Does the poison type affect the survival time? Use $\alpha = 0.05$. 

(d) Does the treatment affect the survival time? Use $\alpha = 0.05$. 

(e) Is there a significant interaction effect? Use $\alpha = 0.05$. 

13. Scheffé (1959, pp. 140-141) reported data from an experiment de- 
signed to study the variation in weight of hybrid female rats in a foster 
nursing. A two-factor factorial design was used with the factors in 
the two-way layout being the genotype of the foster mother and that 
of the litter. The weights in grams as litter averages at 28 days were 
recorded and the data are given as follows. 




                  Genotype of Foster Mother 
Genotype 
of Litter       A        F        I        J 

A             61.5     55.0     52.5     42.0 
              68.2     42.0     61.8     54.0 
              64.0     60.2     49.5     61.0 
              65.0              52.7     48.2 
              59.7                       39.6 
F             60.3     50.8     56.5     51.3 
              51.7     64.7     59.0     40.5 
              49.3     61.7     47.2 
              48.0     64.0     53.0 
                       62.0 
I             37.0     56.3     39.7     50.0 
              36.3     69.8     46.0     43.8 
              68.0     67.0     61.3     54.5 
                                55.3 
                                55.7 
J             59.0     59.5     45.2     44.8 
              57.4     52.8     57.0     51.5 
              54.0     56.0     61.4     53.0 
              47.0                       42.0 
                                         54.0 


Source: Scheffé (1959, p. 140). Used with permission. 


(a) State the model and the assumptions for the experiment. 

(b) Analyze the data and report the analysis of variance table using 
the unweighted means analysis. 

(c) Analyze the data and report the analysis of variance table using 
the weighted means analysis. 

(d) Perform appropriate F tests, using the unweighted and weighted 
means analyses. Use $\alpha = 0.05$. 

(e) Compare the results from the weighted and unweighted means 
analyses. 

14. Davies and Goldsmith (1972, p. 154) reported data from an experi- 
ment designed to investigate sources of variability in testing strength 
of Portland cement. Several small samples of a sample of cement 
were mixed with water and worked for a fixed time, by three differ- 
ent persons (gaugers), and then were cast into cubes. The cubes were 
later tested for compressive strength by three other persons (break- 
ers). Each gauger worked with 12 cubes which were then divided 
into three sets of four, and each breaker tested one set of four cubes 
from each gauger. All the testing was done on the same machine 
and the overall objective of the study was to investigate and quan- 
tify the relative magnitude of the variability in test results due to 




individual differences between gaugers and between breakers. The 
data are shown below where measurements are given in the original 
units of pounds per square inch. 


                              Breaker 
Gauger          1                 2                 3 

1          5280   5520       4340   4400       4160   5180 
           4760   5800       5020   6200       5320   4600 
2          4420   5280       5340   4880       4180   4800 
           5580   4900       4960   6200       4600   4480 
3          5360   6160       5720   4760       4460   4930 
           5680   5500       5620   5560       4680   5600 


Source: Davies and Goldsmith (1972, p. 154). Used with permission. 


(a) Describe the model and the assumptions for the experiment. 
Would you use Model I, Model II, or Model III? In the 
nal experiment, the investigator’s interest was in these particular 
gaugers and breakers. 

(b) Analyze the data and report the analysis of variance table. 

(c) Test whether there are differences in testing strength due to 
gaugers. Use $\alpha = 0.05$. 

(d) Test whether there are differences in testing strength due to 
breakers. Use $\alpha = 0.05$. 

(e) Test whether there are interaction effects between gaugers and 
breakers. Use $\alpha = 0.05$. 

(f) Assuming that the gauger and breaker effects are random, esti- 
mate the variance components of the model (point and interval 
estimates) and determine their relative importance. 


Three-Way and 
Higher-Order Crossed 
Classifications 


5.0 PREVIEW 


Many experiments and surveys involve three or more factors. Multifactor lay- 
outs entail data collection under conditions determined by several factors 
simultaneously. Such layouts usually provide more information and often can be 
even more economical than separate one-way or two-way designs. The models 
and analysis of variance for the case of three or more factors are straightforward 
extensions of the two-way crossed model. The methods of analysis of variance 
for the two-way crossed classification discussed in the preceding two chapters 
can thus be readily generalized to three-way and higher-order classifications. 
In this chapter, we study the three-way crossed classification in some detail 
because it serves as an illustration as to how the analysis can be extended when 
four or more factors are involved. Generalizations to four-way and higher-order 
classifications are briefly outlined. 


5.1 MATHEMATICAL MODEL 


Consider three factors A, B, and C having a, b, and c levels, respectively, and 
let there be n observations in each of the abc cells of the three-way layout. Let 
$y_{ijk\ell}$ be the $\ell$-th observation corresponding to the i-th level of factor A, the j-th 
level of factor B, and the k-th level of factor C. Thus, there is a total of 


N = abcn 


observations in the study. The data involving a total of N = abcn scores $y_{ijk\ell}$'s 
can then be schematically represented as in Table 5.1. 

TABLE 5.1 
Data for a Three-Way Crossed Classification with n Replications per Cell 

Layout: the cells are indexed by the levels of Factor A, Factor B, and Factor C; 
the cell for the i-th level of A, the j-th level of B, and the k-th level of C 
contains the n observations $y_{ijk1}, y_{ijk2}, \ldots, y_{ijkn}$. 


The notation employed here is a straightforward extension of the two-way 
crossed classification. As usual, a dot in the subscript indicates aggregation and 
a dot and a bar indicate averaging over the index represented by the dot. Thus, 
we employ the following notations for sample totals and means: 

$$
\begin{aligned}
y_{ijk.} &= \sum_{\ell=1}^{n} y_{ijk\ell}, & \bar{y}_{ijk.} &= y_{ijk.}/n; \\
y_{ij..} &= \sum_{k=1}^{c} \sum_{\ell=1}^{n} y_{ijk\ell}, & \bar{y}_{ij..} &= y_{ij..}/cn; \\
y_{i.k.} &= \sum_{j=1}^{b} \sum_{\ell=1}^{n} y_{ijk\ell}, & \bar{y}_{i.k.} &= y_{i.k.}/bn; \\
y_{.jk.} &= \sum_{i=1}^{a} \sum_{\ell=1}^{n} y_{ijk\ell}, & \bar{y}_{.jk.} &= y_{.jk.}/an; \\
y_{i...} &= \sum_{j=1}^{b} \sum_{k=1}^{c} \sum_{\ell=1}^{n} y_{ijk\ell}, & \bar{y}_{i...} &= y_{i...}/bcn; \\
y_{.j..} &= \sum_{i=1}^{a} \sum_{k=1}^{c} \sum_{\ell=1}^{n} y_{ijk\ell}, & \bar{y}_{.j..} &= y_{.j..}/acn; \\
y_{..k.} &= \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{\ell=1}^{n} y_{ijk\ell}, & \bar{y}_{..k.} &= y_{..k.}/abn;
\end{aligned}
$$

and 

$$
y_{....} = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{c} \sum_{\ell=1}^{n} y_{ijk\ell}, \qquad \bar{y}_{....} = y_{....}/abcn.
$$


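The analysis of variance model for this type of experimental layout is given 
as 

$$
y_{ijk\ell} = \mu + \alpha_i + \beta_j + \gamma_k + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + (\alpha\beta\gamma)_{ijk} + e_{ijk\ell}
\qquad
\begin{cases}
i = 1, \ldots, a \\
j = 1, \ldots, b \\
k = 1, \ldots, c \\
\ell = 1, \ldots, n,
\end{cases}
\tag{5.1.1}
$$

where 

$\mu$ is the general mean, 
$\alpha_i$ is the effect of the i-th level of factor A, 
$\beta_j$ is the effect of the j-th level of factor B, 
$\gamma_k$ is the effect of the k-th level of factor C, 
$(\alpha\beta)_{ij}$, $(\alpha\gamma)_{ik}$, $(\beta\gamma)_{jk}$ are the effects of the two-factor interactions A × B, 
A × C, and B × C, respectively, 
$(\alpha\beta\gamma)_{ijk}$ is the effect of the three-factor interaction A × B × C, and 
$e_{ijk\ell}$ is the customary error term. 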




5.2 ASSUMPTIONS OF THE MODEL 
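The assumptions of the model (5.1.1) are as follows: 


(i) $e_{ijk\ell}$'s are uncorrelated and randomly distributed with common mean 
zero and variance $\sigma_e^2$. 


(ii) Under Model I, the $\alpha_i$'s, $\beta_j$'s, $\gamma_k$'s, $(\alpha\beta)_{ij}$'s, $(\alpha\gamma)_{ik}$'s, $(\beta\gamma)_{jk}$'s, and 
$(\alpha\beta\gamma)_{ijk}$'s are constants subject to the restrictions: 

$$
\sum_{i=1}^{a} \alpha_i = \sum_{j=1}^{b} \beta_j = \sum_{k=1}^{c} \gamma_k = 0,
$$

$$
\sum_{i=1}^{a} (\alpha\beta)_{ij} = \sum_{j=1}^{b} (\alpha\beta)_{ij} = \sum_{i=1}^{a} (\alpha\gamma)_{ik} = \sum_{k=1}^{c} (\alpha\gamma)_{ik} = \sum_{j=1}^{b} (\beta\gamma)_{jk} = \sum_{k=1}^{c} (\beta\gamma)_{jk} = 0,
$$

and 

$$
\sum_{i=1}^{a} (\alpha\beta\gamma)_{ijk} = \sum_{j=1}^{b} (\alpha\beta\gamma)_{ijk} = \sum_{k=1}^{c} (\alpha\beta\gamma)_{ijk} = 0.
$$


(iii) Under Model II, $\alpha_i$'s, $\beta_j$'s, $\gamma_k$'s, $(\alpha\beta)_{ij}$'s, $(\alpha\gamma)_{ik}$'s, $(\beta\gamma)_{jk}$'s, $(\alpha\beta\gamma)_{ijk}$'s, 
and $e_{ijk\ell}$'s are mutually and completely uncorrelated random variables 
with mean zero and respective variances $\sigma_\alpha^2$, $\sigma_\beta^2$, $\sigma_\gamma^2$, $\sigma_{\alpha\beta}^2$, $\sigma_{\alpha\gamma}^2$, $\sigma_{\beta\gamma}^2$, 
$\sigma_{\alpha\beta\gamma}^2$, and $\sigma_e^2$. 

(iv) Under Model III, several variations exist depending upon which factors 
are assumed fixed and which random. Suppose that factor A has fixed 
effects and factors B and C have random effects. In this case, $\alpha_i$'s 
are constants; $\beta_j$'s, $\gamma_k$'s, $(\alpha\beta)_{ij}$'s, $(\alpha\gamma)_{ik}$'s, $(\beta\gamma)_{jk}$'s, $(\alpha\beta\gamma)_{ijk}$'s, and 
$e_{ijk\ell}$'s are random variables with mean zero and respective variances 
$\sigma_\beta^2$, $\sigma_\gamma^2$, $\sigma_{\alpha\beta}^2$, $\sigma_{\alpha\gamma}^2$, $\sigma_{\beta\gamma}^2$, $\sigma_{\alpha\beta\gamma}^2$, and $\sigma_e^2$, subject to the restrictions: 

$$
\sum_{i=1}^{a} \alpha_i = 0 \tag{5.2.1}
$$

$$
\sum_{i=1}^{a} (\alpha\beta)_{ij} = \sum_{i=1}^{a} (\alpha\gamma)_{ik} = \sum_{i=1}^{a} (\alpha\beta\gamma)_{ijk} = 0, \tag{5.2.2}
$$

for all j and k. 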


Note that all interaction terms in Model III are assumed to be random, since 
at least one of the factors involved is a random effects factor. Furthermore, 
the sums of effects involving the fixed factor are zero when summed over the 
fixed factor levels. The correlations between random effects resulting from 




restrictions (5.2.2) can be derived, but are not considered here. Other mixed 
effects models can be developed in a similar fashion. For example, a model 
analogous to the two-way mixed model discussed in Section 4.21 involves 
the restriction (5.2.1) but not (5.2.2). This implies that the random effects are 
mutually and completely uncorrelated random variables.¹ 


5.3. PARTITION OF THE TOTAL SUM OF SQUARES 
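As before, the total sum of squares can be partitioned by starting with the 
identity: 

$$
\begin{aligned}
y_{ijk\ell} - \bar{y}_{....} ={}& (\bar{y}_{i...} - \bar{y}_{....}) + (\bar{y}_{.j..} - \bar{y}_{....}) + (\bar{y}_{..k.} - \bar{y}_{....}) \\
&+ (\bar{y}_{ij..} - \bar{y}_{i...} - \bar{y}_{.j..} + \bar{y}_{....}) \\
&+ (\bar{y}_{i.k.} - \bar{y}_{i...} - \bar{y}_{..k.} + \bar{y}_{....}) \\
&+ (\bar{y}_{.jk.} - \bar{y}_{.j..} - \bar{y}_{..k.} + \bar{y}_{....}) \\
&+ (\bar{y}_{ijk.} - \bar{y}_{ij..} - \bar{y}_{i.k.} - \bar{y}_{.jk.} + \bar{y}_{i...} + \bar{y}_{.j..} + \bar{y}_{..k.} - \bar{y}_{....}) \\
&+ (y_{ijk\ell} - \bar{y}_{ijk.}).
\end{aligned}
\tag{5.3.1}
$$

Squaring each side and summing over i, j, k, and $\ell$, and noting that the 
cross-product terms drop out, we obtain 

$$
SS_T = SS_A + SS_B + SS_C + SS_{AB} + SS_{AC} + SS_{BC} + SS_{ABC} + SS_E,
$$

where 

$$
\begin{aligned}
SS_T &= \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{c} \sum_{\ell=1}^{n} (y_{ijk\ell} - \bar{y}_{....})^2, \\
SS_A &= bcn \sum_{i=1}^{a} (\bar{y}_{i...} - \bar{y}_{....})^2, \\
SS_B &= acn \sum_{j=1}^{b} (\bar{y}_{.j..} - \bar{y}_{....})^2, \\
SS_C &= abn \sum_{k=1}^{c} (\bar{y}_{..k.} - \bar{y}_{....})^2, \\
SS_{AB} &= cn \sum_{i=1}^{a} \sum_{j=1}^{b} (\bar{y}_{ij..} - \bar{y}_{i...} - \bar{y}_{.j..} + \bar{y}_{....})^2,
\end{aligned}
$$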

¹ For a more general formulation of the three-way mixed model as an extension of the two-way 
mixed model by Scheffé, see Imhof (1960). 




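$$
\begin{aligned}
SS_{AC} &= bn \sum_{i=1}^{a} \sum_{k=1}^{c} (\bar{y}_{i.k.} - \bar{y}_{i...} - \bar{y}_{..k.} + \bar{y}_{....})^2, \\
SS_{BC} &= an \sum_{j=1}^{b} \sum_{k=1}^{c} (\bar{y}_{.jk.} - \bar{y}_{.j..} - \bar{y}_{..k.} + \bar{y}_{....})^2, \\
SS_{ABC} &= n \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{c} (\bar{y}_{ijk.} - \bar{y}_{ij..} - \bar{y}_{i.k.} - \bar{y}_{.jk.} + \bar{y}_{i...} + \bar{y}_{.j..} + \bar{y}_{..k.} - \bar{y}_{....})^2,
\end{aligned}
$$

and 

$$
SS_E = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{c} \sum_{\ell=1}^{n} (y_{ijk\ell} - \bar{y}_{ijk.})^2.
$$


Here, $SS_T$ is the total sum of squares; $SS_A$, $SS_B$, $SS_C$ are the usual main 
effects sums of squares; $SS_{AB}$, $SS_{AC}$, $SS_{BC}$ are the usual two-factor interaction 
sums of squares; $SS_{ABC}$ is the three-factor interaction sum of squares; and $SS_E$ 
is the error sum of squares. 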
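
The partition can be checked numerically. The following is a minimal sketch, assuming a balanced layout stored as nested lists with `y[i][j][k]` holding the n observations of a cell; the function name, the data layout, and the simulated observations are all hypothetical and serve only to illustrate the identity.

```python
import random
from itertools import product


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def three_way_sums_of_squares(y):
    """Sums of squares for a balanced layout; y[i][j][k] lists the n observations of a cell."""
    a, b, c, n = len(y), len(y[0]), len(y[0][0]), len(y[0][0][0])
    A, B, C = range(a), range(b), range(c)
    # Cell means and marginal means (means of means are valid because the layout is balanced).
    cell = {(i, j, k): mean(y[i][j][k]) for i, j, k in product(A, B, C)}
    m_ij = {(i, j): mean(cell[i, j, k] for k in C) for i, j in product(A, B)}
    m_ik = {(i, k): mean(cell[i, j, k] for j in B) for i, k in product(A, C)}
    m_jk = {(j, k): mean(cell[i, j, k] for i in A) for j, k in product(B, C)}
    m_i = {i: mean(m_ij[i, j] for j in B) for i in A}
    m_j = {j: mean(m_ij[i, j] for i in A) for j in B}
    m_k = {k: mean(m_ik[i, k] for i in A) for k in C}
    g = mean(m_i.values())  # grand mean
    return {
        "A": b * c * n * sum((m_i[i] - g) ** 2 for i in A),
        "B": a * c * n * sum((m_j[j] - g) ** 2 for j in B),
        "C": a * b * n * sum((m_k[k] - g) ** 2 for k in C),
        "AB": c * n * sum((m_ij[i, j] - m_i[i] - m_j[j] + g) ** 2 for i, j in product(A, B)),
        "AC": b * n * sum((m_ik[i, k] - m_i[i] - m_k[k] + g) ** 2 for i, k in product(A, C)),
        "BC": a * n * sum((m_jk[j, k] - m_j[j] - m_k[k] + g) ** 2 for j, k in product(B, C)),
        "ABC": n * sum(
            (cell[i, j, k] - m_ij[i, j] - m_ik[i, k] - m_jk[j, k]
             + m_i[i] + m_j[j] + m_k[k] - g) ** 2
            for i, j, k in product(A, B, C)
        ),
        "E": sum((obs - cell[i, j, k]) ** 2
                 for i, j, k in product(A, B, C) for obs in y[i][j][k]),
        "T": sum((obs - g) ** 2
                 for i, j, k in product(A, B, C) for obs in y[i][j][k]),
    }


# Simulated data for a = 2, b = 3, c = 2, n = 2 (illustration only, not from the text).
rng = random.Random(0)
y = [[[[rng.gauss(50.0, 5.0) for _ in range(2)] for _ in range(2)]
      for _ in range(3)] for _ in range(2)]
ss = three_way_sums_of_squares(y)
components = sum(ss[name] for name in ("A", "B", "C", "AB", "AC", "BC", "ABC", "E"))
partition_gap = abs(ss["T"] - components)
print(partition_gap < 1e-8)
```

The sketch relies on balance: with unequal cell frequencies the marginal means are no longer simple averages of cell means and the cross-product terms do not vanish.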


Remark: In a three-way crossed classification, one can compute abc separate cell 
variances as $\sum_{\ell=1}^{n} (y_{ijk\ell} - \bar{y}_{ijk.})^2/(n-1)$, which can then be tested for homogeneity of variances 
(see Section 2.21). 
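
As a small illustration of the remark, the cell variances can be computed directly from the observations in each cell. The data below are hypothetical values for a 2 × 2 × 1 layout with n = 3, invented only for this sketch; the ratio of the largest to the smallest cell variance is shown merely as a quick descriptive summary and is not one of the formal tests of Section 2.21.

```python
from statistics import variance

# Hypothetical observations keyed by (i, j, k); not data from the text.
cells = {
    (1, 1, 1): [10.0, 12.0, 14.0],
    (1, 2, 1): [9.0, 9.5, 10.0],
    (2, 1, 1): [11.0, 15.0, 13.0],
    (2, 2, 1): [8.0, 8.0, 11.0],
}

# statistics.variance uses the divisor n - 1, matching the cell variance above.
cell_variances = {cell: variance(obs) for cell, obs in cells.items()}
variance_ratio = max(cell_variances.values()) / min(cell_variances.values())

for cell, s2 in sorted(cell_variances.items()):
    print(cell, s2)
print("largest / smallest:", variance_ratio)
```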


5.4 MEAN SQUARES AND THEIR EXPECTATIONS 


As usual the mean squares are obtained by dividing the sums of squares by the 
corresponding degrees of freedom. The degrees of freedom for main effects and 
two-factor interactions sums of squares correspond to those for the two-way 
classification. The number of degrees of freedom for the three-factor interaction 
is obtained by subtraction and corresponds to the number of independent linear 
relations among all the interaction terms $(\alpha\beta\gamma)_{ijk}$'s. 

The expected mean squares are obtained in the same way as in the earlier 
derivations.² The results on the partition of the degrees of freedom and the sum 
of squares, and the expected mean squares are summarized in the form of an 
analysis of variance table as shown in Table 5.2. 
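
The degrees-of-freedom bookkeeping described above can be sketched as follows. The function name and the example dimensions are hypothetical; the three-factor interaction is obtained by subtraction, as in the text, and then compared with the product (a − 1)(b − 1)(c − 1).

```python
def degrees_of_freedom(a, b, c, n):
    """Degrees of freedom for the balanced three-way crossed classification."""
    df = {
        "A": a - 1,
        "B": b - 1,
        "C": c - 1,
        "AB": (a - 1) * (b - 1),
        "AC": (a - 1) * (c - 1),
        "BC": (b - 1) * (c - 1),
        "E": a * b * c * (n - 1),
    }
    total = a * b * c * n - 1
    # Three-factor interaction obtained by subtraction from the total.
    df["ABC"] = total - sum(df.values())
    df["Total"] = total
    return df


# Hypothetical layout: a = 3, b = 4, c = 2 with n = 2 observations per cell.
df = degrees_of_freedom(3, 4, 2, 2)
print(df)
print(df["ABC"] == (3 - 1) * (4 - 1) * (2 - 1))
```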


5.5 TESTS OF HYPOTHESES: THE ANALYSIS OF 
VARIANCE F TESTS 


By assuming the normality of the random components in model (5.1.1), the 
sampling distributions of mean squares can be derived in terms of central and 
noncentral chi-square variables. The results are obvious extensions of the 
results given in Section 4.5 for the two-way classification. The tests for main and 
interaction effects can be readily obtained by the results of the sampling 
distributions of mean squares and their expectations. In the following we summarize 
the tests for fixed, random, and mixed effects models. 


² One can use the algorithms formulated by Schultz (1955) and others to reproduce the results on 
expected mean squares rather quickly. See Appendix U for a discussion of the rules for finding 
expected mean squares. 


TABLE 5.2 
Analysis of Variance for Model (5.1.1) 

Source of                  Degrees of                   Sum of        Mean 
Variation                  Freedom                      Squares       Square 

Due to A                   a − 1                        SS_A          MS_A 
Due to B                   b − 1                        SS_B          MS_B 
Due to C                   c − 1                        SS_C          MS_C 
Interaction A × B          (a − 1)(b − 1)               SS_AB         MS_AB 
Interaction A × C          (a − 1)(c − 1)               SS_AC         MS_AC 
Interaction B × C          (b − 1)(c − 1)               SS_BC         MS_BC 
Interaction A × B × C      (a − 1)(b − 1)(c − 1)        SS_ABC        MS_ABC 
Error                      abc(n − 1)                   SS_E          MS_E 
Total                      abcn − 1                     SS_T 

Expected Mean Square columns: Model I (A, B, and C Fixed); Model II (A, B, and 
C Random); Model III (A Fixed, B and C Random); Model III (A Random, B and C Fixed). 


MODEL I (FIXED EFFECTS) 


We note from the analysis of variance Table 5.2 that $MS_A$, $MS_B$, $MS_C$, $MS_{AB}$, 
$MS_{AC}$, $MS_{BC}$, and $MS_{ABC}$ all have expectations equal to $\sigma_e^2$ if there are no 
factor effects of the type reflected by the corresponding mean squares. If there 
are such effects, each mean square has an expectation exceeding $\sigma_e^2$. Also, 
the expectation of $MS_E$ is always $\sigma_e^2$, as was the case in the analyses of other 
models. Hence, the tests for factor effects and their interactions can be obtained 
by comparing the appropriate mean square against $MS_E$, a large value of 
the mean square ratio indicating the presence of the corresponding factor or 
interaction effect. The development of various test procedures follows the same 
pattern as in the case of two-way classification. In Table 5.3, we summarize 
the hypotheses of interest, the corresponding test statistics, and the appropriate 
percentiles of the F distribution. 


Remarks: (i) An examination of Table 5.2 reveals that if $n = 1$ there are no degrees
of freedom associated with the error term. Thus, we must have at least 2 observations
($n \ge 2$) in order to determine a sum of squares due to error if all possible interactions
are included in the model. For further discussion of this point, see Section 5.10.

(ii) If some of the interaction terms are zero, one may consider the possibility of 
pooling those terms with the error sum of squares. However, as discussed earlier in 
Section 4.6, the pooling of nonsignificant mean squares should be carried out with 
a great deal of discretion and not as a general rule. The pooling should probably be 
restricted to the mean squares corresponding to the effects that from prior experience 
are unlikely to yield significant results (not expected to be appreciable). 

(iii) As in the case of two-way crossed classification, it may be of interest to consider
the significance level associated with the experiment as a whole. Let $\alpha_1, \alpha_2, \ldots, \alpha_7$ be
the significance levels of the seven F statistics, $F_A, F_B, \ldots, F_{ABC}$, respectively, and
let $\alpha$ be the significance level comprising all seven tests. Then again it follows that
$\alpha < 1 - \prod_{i=1}^{7}(1-\alpha_i)$. For example, if $\alpha_1 = \alpha_2 = \cdots = \alpha_7 = 0.05$, then $\alpha < 0.302$; and
if $\alpha_1 = \alpha_2 = \cdots = \alpha_7 = 0.01$, then $\alpha < 0.068$.
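
The two numerical bounds in remark (iii) can be checked directly. The following is a minimal sketch; the function name `experimentwise_bound` is an illustrative choice and not part of the text.

```python
def experimentwise_bound(levels):
    """Upper bound 1 - prod(1 - alpha_i) on the overall significance level."""
    product = 1.0
    for alpha_i in levels:
        product *= 1.0 - alpha_i
    return 1.0 - product


# Seven F tests, each carried out at the same individual level.
print(round(experimentwise_bound([0.05] * 7), 3))
print(round(experimentwise_bound([0.01] * 7), 3))
```

The bound grows quickly with the number of tests, which is why the overall level deserves attention when all seven F statistics are examined together.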


TABLE 5.3
Tests of Hypotheses for Model (5.1.1) under Model I

Hypothesis    Test Statistic    Percentile
$H_0^A$: all $\alpha_i = 0$ versus $H_1^A$: not all $\alpha_i = 0$    $F_A = MS_A/MS_E$    $F[a-1,\ abc(n-1);\ 1-\alpha]$
$H_0^B$: all $\beta_j = 0$ versus $H_1^B$: not all $\beta_j = 0$    $F_B = MS_B/MS_E$    $F[b-1,\ abc(n-1);\ 1-\alpha]$
$H_0^C$: all $\gamma_k = 0$ versus $H_1^C$: not all $\gamma_k = 0$    $F_C = MS_C/MS_E$    $F[c-1,\ abc(n-1);\ 1-\alpha]$
$H_0^{AB}$: all $(\alpha\beta)_{ij} = 0$ versus $H_1^{AB}$: not all $(\alpha\beta)_{ij} = 0$    $F_{AB} = MS_{AB}/MS_E$    $F[(a-1)(b-1),\ abc(n-1);\ 1-\alpha]$
$H_0^{AC}$: all $(\alpha\gamma)_{ik} = 0$ versus $H_1^{AC}$: not all $(\alpha\gamma)_{ik} = 0$    $F_{AC} = MS_{AC}/MS_E$    $F[(a-1)(c-1),\ abc(n-1);\ 1-\alpha]$
$H_0^{BC}$: all $(\beta\gamma)_{jk} = 0$ versus $H_1^{BC}$: not all $(\beta\gamma)_{jk} = 0$    $F_{BC} = MS_{BC}/MS_E$    $F[(b-1)(c-1),\ abc(n-1);\ 1-\alpha]$
$H_0^{ABC}$: all $(\alpha\beta\gamma)_{ijk} = 0$ versus $H_1^{ABC}$: not all $(\alpha\beta\gamma)_{ijk} = 0$    $F_{ABC} = MS_{ABC}/MS_E$    $F[(a-1)(b-1)(c-1),\ abc(n-1);\ 1-\alpha]$


MODEL II (RANDOM EFFECTS)


The appropriate test statistics for various hypotheses of interest can be
determined by examining the expected mean squares in Table 5.2. However, for the
first time, we encounter the difficulty that even under the normality assumption
exact F tests may not be available for some of the hypotheses usually
tested. There is no difficulty about testing the hypotheses on the three-factor
or the two-factor interactions. Thus, from the expected mean square column of
Table 5.2, we see that the hypothesis $H_0^{ABC}: \sigma_{\alpha\beta\gamma}^2 = 0$ versus $H_1^{ABC}: \sigma_{\alpha\beta\gamma}^2 > 0$
can be tested with the ratio $MS_{ABC}/MS_E$; and $H_0^{AB}: \sigma_{\alpha\beta}^2 = 0$ versus
$H_1^{AB}: \sigma_{\alpha\beta}^2 > 0$ with $MS_{AB}/MS_{ABC}$, and so on. Now, suppose that we wish to
test $H_0^A: \sigma_\alpha^2 = 0$ versus $H_1^A: \sigma_\alpha^2 > 0$ (cases $H_0^B$ and $H_0^C$ can, of course, be
treated similarly). If we are willing to assume that $\sigma_{\alpha\beta}^2 = 0$, then an exact F test
of $H_0^A$ can be based on the statistic $MS_A/MS_{AC}$. In this case, $SS_{AB}$ could be pooled
with $SS_{ABC}$ since they would have the same expected mean squares. Similarly, if we are
willing to assume that $\sigma_{\alpha\gamma}^2 = 0$, we may test $H_0^A$ with $MS_A/MS_{AB}$ and pool
$SS_{AC}$ with $SS_{ABC}$. Furthermore, if we are willing to assume other variance
components to be zero, there would be no difficulty in deducing exact tests,
if any, of the standard hypotheses and pooling procedures obtained from the
analysis of variance Table 5.2, by deleting in it the components assumed to be
zero. However, if we are unwilling to assume that $\sigma_{\alpha\beta}^2 = 0$ or $\sigma_{\alpha\gamma}^2 = 0$, then no
exact test of $H_0^A$ can be found from Table 5.2.


An approximate F test of $H_0^A$ can be obtained by using a procedure due
to Satterthwaite (1946) and Welch (1936, 1956). (For a detailed discussion
of the procedure, see Appendix K.) To illustrate the procedure for testing the
hypothesis

$$H_0^A: \sigma_\alpha^2 = 0 \quad \text{versus} \quad H_1^A: \sigma_\alpha^2 > 0, \tag{5.5.1}$$

we note from Table 5.2 that

$$E(MS_{AB}) + E(MS_{AC}) - E(MS_{ABC}) = \sigma_e^2 + cn\sigma_{\alpha\beta}^2 + bn\sigma_{\alpha\gamma}^2 + n\sigma_{\alpha\beta\gamma}^2,$$

which is precisely equal to $E(MS_A)$ when $\sigma_\alpha^2 = 0$. Hence, the suggested F
statistic is

$$F_A = \frac{MS_A}{MS_{AB} + MS_{AC} - MS_{ABC}}, \tag{5.5.2}$$

which has an approximate F distribution with $a-1$ and $\nu_a$ degrees of freedom,
where $\nu_a$ is approximated by

$$\nu_a = \frac{(MS_{AB} + MS_{AC} - MS_{ABC})^2}{\dfrac{(MS_{AB})^2}{(a-1)(b-1)} + \dfrac{(MS_{AC})^2}{(a-1)(c-1)} + \dfrac{(MS_{ABC})^2}{(a-1)(b-1)(c-1)}}. \tag{5.5.3}$$
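
The pseudo-F statistic (5.5.2) and the approximate denominator degrees of freedom (5.5.3) are simple to compute once the mean squares are available. The sketch below is illustrative only: the function name `satterthwaite_pseudo_f` and the numerical mean squares in the usage line are hypothetical, not taken from the text. It refuses to proceed when the denominator is not positive.

```python
def satterthwaite_pseudo_f(ms_a, ms_ab, ms_ac, ms_abc, a, b, c):
    """Return (F_A, numerator df, nu_a) following (5.5.2) and (5.5.3)."""
    denom = ms_ab + ms_ac - ms_abc
    if denom <= 0:
        # The linear combination contains a negative term and may be <= 0.
        raise ValueError("MS_AB + MS_AC - MS_ABC is not positive")
    f_a = ms_a / denom
    nu_a = denom ** 2 / (
        ms_ab ** 2 / ((a - 1) * (b - 1))
        + ms_ac ** 2 / ((a - 1) * (c - 1))
        + ms_abc ** 2 / ((a - 1) * (b - 1) * (c - 1))
    )
    return f_a, a - 1, nu_a


# Hypothetical mean squares for a 3 x 4 x 5 layout.
f_a, df1, nu_a = satterthwaite_pseudo_f(50.0, 10.0, 8.0, 4.0, a=3, b=4, c=5)
print(f_a, df1, nu_a)
```

The returned `nu_a` is generally not an integer; it can be rounded or used for interpolation in an F table.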


Remarks: (i) Because of the lack of uniqueness of the approximate F ratio (different
F ratios may result from the use of different linear combinations of mean squares) and
because of the necessity of approximating the degrees of freedom, the procedure is of
limited usefulness. However, if used with care, the test procedure can be of value. The
reader is referred to Cochran (1951) for a detailed discussion of this problem.

(ii) Usually, the degrees of freedom given by (5.5.3) will not be an integer. One can
then either interpolate in the F distribution table, or round to the nearest integer. In
practice, the choice of the nearest integer will be more than adequate.

(iii) An alternative test statistic for testing the hypothesis (5.5.1) is

$$F'_A = \frac{MS_A + MS_{ABC}}{MS_{AB} + MS_{AC}}.$$

The approximate degrees of freedom for both the numerator and the denominator are
obtained as in (5.5.3) using Satterthwaite's rule. Because of the need to estimate only
the denominator degrees of freedom, the test criterion (5.5.2) might be expected to
have better power but it suffers from the drawback that the approximation (5.5.2) is
less accurate when the linear combination of mean squares contains a negative term.
Moreover, the denominator of the test statistic (5.5.2) can assume a negative value. The
problem, however, may be less important if the contribution of $MS_{ABC}$ is relatively small
and the corresponding degrees of freedom are large. The reader is referred to Cochran
and Cox (1957), Hudson and Krutchkoff (1968), and Gaylor and Hopper (1969) for some
further discussions and treatment of this topic. The general consensus seems to be that
the two statistics are comparable in terms of size and power performance under a wide
range of parameter values (see, e.g., Davenport and Webster (1973); Lorenzen (1987)).

(iv) An alternative to the Satterthwaite approximation for estimating the degrees of
freedom for $F_A$ and $F'_A$ has been proposed by Myers and Howe (1971), but the procedure
has been found to provide a liberal test (Davenport (1975)).

(v) An alternative to an approximate F test of $H_0^A: \sigma_\alpha^2 = 0$ has been proposed by
Jeyaratnam and Graybill (1980), which is tied to the lower confidence bound of $\sigma_\alpha^2$. For
a discussion of some other test procedures for this problem, see Naik (1974) and Seifert
(1981). Birch et al. (1990) and Burdick (1994) provide results of a simulation study to
compare several tests for the main effects variance components in model (5.1.1).


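To obtain a procedure for testing the hypothesis

$$H_0^B: \sigma_\beta^2 = 0 \quad \text{versus} \quad H_1^B: \sigma_\beta^2 > 0,$$

interchange A and B (also a and b) in (5.5.2) and (5.5.3). Similarly, for the
hypothesis

$$H_0^C: \sigma_\gamma^2 = 0 \quad \text{versus} \quad H_1^C: \sigma_\gamma^2 > 0,$$

interchange A and C (also a and c) in (5.5.2) and (5.5.3).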

Finally, similar to Table 5.3 for Model I, the hypotheses of interest, the
corresponding test statistics, and the appropriate percentiles of the F distribution are
summarized in Table 5.4.


TABLE 5.4
Tests of Hypotheses for Model (5.1.1) under Model II

Hypothesis    Test Statistic*    Percentile
$H_0^A: \sigma_\alpha^2 = 0$ versus $H_1^A: \sigma_\alpha^2 > 0$    $F_A = MS_A/(MS_{AB} + MS_{AC} - MS_{ABC})$    $F[a-1,\ \nu_a;\ 1-\alpha]$
$H_0^B: \sigma_\beta^2 = 0$ versus $H_1^B: \sigma_\beta^2 > 0$    $F_B = MS_B/(MS_{AB} + MS_{BC} - MS_{ABC})$    $F[b-1,\ \nu_b;\ 1-\alpha]$
$H_0^C: \sigma_\gamma^2 = 0$ versus $H_1^C: \sigma_\gamma^2 > 0$    $F_C = MS_C/(MS_{AC} + MS_{BC} - MS_{ABC})$    $F[c-1,\ \nu_c;\ 1-\alpha]$
$H_0^{AB}: \sigma_{\alpha\beta}^2 = 0$ versus $H_1^{AB}: \sigma_{\alpha\beta}^2 > 0$    $F_{AB} = MS_{AB}/MS_{ABC}$    $F[(a-1)(b-1),\ (a-1)(b-1)(c-1);\ 1-\alpha]$
$H_0^{AC}: \sigma_{\alpha\gamma}^2 = 0$ versus $H_1^{AC}: \sigma_{\alpha\gamma}^2 > 0$    $F_{AC} = MS_{AC}/MS_{ABC}$    $F[(a-1)(c-1),\ (a-1)(b-1)(c-1);\ 1-\alpha]$
$H_0^{BC}: \sigma_{\beta\gamma}^2 = 0$ versus $H_1^{BC}: \sigma_{\beta\gamma}^2 > 0$    $F_{BC} = MS_{BC}/MS_{ABC}$    $F[(b-1)(c-1),\ (a-1)(b-1)(c-1);\ 1-\alpha]$
$H_0^{ABC}: \sigma_{\alpha\beta\gamma}^2 = 0$ versus $H_1^{ABC}: \sigma_{\alpha\beta\gamma}^2 > 0$    $F_{ABC} = MS_{ABC}/MS_E$    $F[(a-1)(b-1)(c-1),\ abc(n-1);\ 1-\alpha]$

* For the test statistics $F_A$, $F_B$, and $F_C$, the denominator degrees of freedom $\nu_a$, $\nu_b$, and $\nu_c$ are
obtained using the formula (5.5.3) and its obvious analogues for $\nu_b$ and $\nu_c$.


MODEL III (MIXED EFFECTS)


Suppose that A is fixed and B and C are random. The approximate F test given
by (5.5.2) is used to test

$$H_0: \alpha_i = 0,\ i = 1, 2, \ldots, a \quad \text{versus} \quad H_1: \text{not all } \alpha_i\text{'s are zero}.$$

The other six F ratios test the following principal null hypotheses:

$$\sigma_\beta^2 = 0,\ \ \sigma_\gamma^2 = 0,\ \ \sigma_{\alpha\beta}^2 = 0,\ \ \sigma_{\alpha\gamma}^2 = 0,\ \ \sigma_{\beta\gamma}^2 = 0,\ \ \text{and}\ \ \sigma_{\alpha\beta\gamma}^2 = 0.$$

The results are summarized in Table 5.5. If B is the fixed factor and A and C
are random, then interchange A and B (also a and b) in Table 5.5. Similarly, if
C is fixed and A and B are random, then interchange A and C (also a and c) in
Table 5.5.

Next, suppose A is random and B and C are fixed. The results of all the
principal hypotheses of interest are summarized in Table 5.6. Note that all the
F tests are exact and no approximate tests are necessary. If the random factor
is B, interchange the role of A and B (also a and b) in Table 5.6. If the random
factor is C, interchange the role of A and C (also a and c) in Table 5.6.


5.6 POINT AND INTERVAL ESTIMATION 


No new problems arise in obtaining unbiased estimators of variance components 
for random effects factors or in the estimation of contrasts for fixed effects 
factors in Models I or III. Confidence limits for contrasts for fixed effects 
factors are constructed by using the mean square employed in the denominator 
of the test statistic while testing for the effects of that factor. The degrees of 
freedom correspond to the mean square used in the denominator. The results 
on point estimation for Model I are summarized in Table 5.7. 


TABLE 5.5
Tests of Hypotheses for Model (5.1.1) under Model III
(A Fixed, B and C Random)

Hypothesis    Test Statistic*    Percentile
$H_0^A$: all $\alpha_i = 0$ versus $H_1^A$: not all $\alpha_i = 0$    $F_A = MS_A/(MS_{AB} + MS_{AC} - MS_{ABC})$    $F[a-1,\ \nu_a;\ 1-\alpha]$
$H_0^B: \sigma_\beta^2 = 0$ versus $H_1^B: \sigma_\beta^2 > 0$    $F_B = MS_B/MS_{BC}$    $F[b-1,\ (b-1)(c-1);\ 1-\alpha]$
$H_0^C: \sigma_\gamma^2 = 0$ versus $H_1^C: \sigma_\gamma^2 > 0$    $F_C = MS_C/MS_{BC}$    $F[c-1,\ (b-1)(c-1);\ 1-\alpha]$
$H_0^{AB}: \sigma_{\alpha\beta}^2 = 0$ versus $H_1^{AB}: \sigma_{\alpha\beta}^2 > 0$    $F_{AB} = MS_{AB}/MS_{ABC}$    $F[(a-1)(b-1),\ (a-1)(b-1)(c-1);\ 1-\alpha]$
$H_0^{AC}: \sigma_{\alpha\gamma}^2 = 0$ versus $H_1^{AC}: \sigma_{\alpha\gamma}^2 > 0$    $F_{AC} = MS_{AC}/MS_{ABC}$    $F[(a-1)(c-1),\ (a-1)(b-1)(c-1);\ 1-\alpha]$
$H_0^{BC}: \sigma_{\beta\gamma}^2 = 0$ versus $H_1^{BC}: \sigma_{\beta\gamma}^2 > 0$    $F_{BC} = MS_{BC}/MS_E$    $F[(b-1)(c-1),\ abc(n-1);\ 1-\alpha]$
$H_0^{ABC}: \sigma_{\alpha\beta\gamma}^2 = 0$ versus $H_1^{ABC}: \sigma_{\alpha\beta\gamma}^2 > 0$    $F_{ABC} = MS_{ABC}/MS_E$    $F[(a-1)(b-1)(c-1),\ abc(n-1);\ 1-\alpha]$

* For the test statistic $F_A$, the denominator degrees of freedom $\nu_a$ is determined using the formula
(5.5.3).


TABLE 5.6
Tests of Hypotheses for Model (5.1.1) under Model III
(A Random, B and C Fixed)

Hypothesis    Test Statistic    Percentile
$H_0^A: \sigma_\alpha^2 = 0$ versus $H_1^A: \sigma_\alpha^2 > 0$    $F_A = MS_A/MS_E$    $F[a-1,\ abc(n-1);\ 1-\alpha]$
$H_0^B$: all $\beta_j = 0$ versus $H_1^B$: not all $\beta_j = 0$    $F_B = MS_B/MS_{AB}$    $F[b-1,\ (a-1)(b-1);\ 1-\alpha]$
$H_0^C$: all $\gamma_k = 0$ versus $H_1^C$: not all $\gamma_k = 0$    $F_C = MS_C/MS_{AC}$    $F[c-1,\ (a-1)(c-1);\ 1-\alpha]$
$H_0^{AB}: \sigma_{\alpha\beta}^2 = 0$ versus $H_1^{AB}: \sigma_{\alpha\beta}^2 > 0$    $F_{AB} = MS_{AB}/MS_E$    $F[(a-1)(b-1),\ abc(n-1);\ 1-\alpha]$
$H_0^{AC}: \sigma_{\alpha\gamma}^2 = 0$ versus $H_1^{AC}: \sigma_{\alpha\gamma}^2 > 0$    $F_{AC} = MS_{AC}/MS_E$    $F[(a-1)(c-1),\ abc(n-1);\ 1-\alpha]$
$H_0^{BC}$: all $(\beta\gamma)_{jk} = 0$ versus $H_1^{BC}$: not all $(\beta\gamma)_{jk} = 0$    $F_{BC} = MS_{BC}/MS_{ABC}$    $F[(b-1)(c-1),\ (a-1)(b-1)(c-1);\ 1-\alpha]$
$H_0^{ABC}: \sigma_{\alpha\beta\gamma}^2 = 0$ versus $H_1^{ABC}: \sigma_{\alpha\beta\gamma}^2 > 0$    $F_{ABC} = MS_{ABC}/MS_E$    $F[(a-1)(b-1)(c-1),\ abc(n-1);\ 1-\alpha]$


An exact $100(1-\alpha)$ percent confidence interval for $\sigma_e^2$ is

$$\frac{abc(n-1)MS_E}{\chi^2[abc(n-1),\ 1-\alpha/2]} < \sigma_e^2 < \frac{abc(n-1)MS_E}{\chi^2[abc(n-1),\ \alpha/2]}. \tag{5.6.1}$$

Confidence intervals for other parameters under Model I are obtained from
the corresponding items in columns (2) and (3) of Table 5.7. For example, for
making pairwise comparisons, Bonferroni intervals are given by

$$\bar{y}_{i...} - \bar{y}_{i'...} - \theta\sqrt{\frac{2MS_E}{bcn}} < \alpha_i - \alpha_{i'} < \bar{y}_{i...} - \bar{y}_{i'...} + \theta\sqrt{\frac{2MS_E}{bcn}}, \tag{5.6.2}$$

where

$$\theta = t[abc(n-1),\ 1-\alpha/2m]$$

is the t value with m being the number of intervals constructed. Similarly,
$100(1-\alpha)$ percent Bonferroni intervals for the contrast

$$L = \sum_{i=1}^{a}\ell_i\alpha_i \quad \left(\sum_{i=1}^{a}\ell_i = 0\right)$$

are determined by

$$\sum_{i=1}^{a}\ell_i\bar{y}_{i...} - \theta\sqrt{\frac{MS_E}{bcn}\sum_{i=1}^{a}\ell_i^2} < L < \sum_{i=1}^{a}\ell_i\bar{y}_{i...} + \theta\sqrt{\frac{MS_E}{bcn}\sum_{i=1}^{a}\ell_i^2}. \tag{5.6.3}$$

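TABLE 5.7
Estimates of Parameters and Their Variances under Model I*

Parameter    Point Estimate    Variance of Estimate
$\mu$    $\bar{y}_{....}$    $\sigma_e^2/abcn$
$\alpha_i$    $\bar{y}_{i...} - \bar{y}_{....}$    $(a-1)\sigma_e^2/abcn$
$\mu + \alpha_i$    $\bar{y}_{i...}$    $\sigma_e^2/bcn$
$\mu + \alpha_i + \beta_j + (\alpha\beta)_{ij}$    $\bar{y}_{ij..}$    $\sigma_e^2/cn$
$\mu + \alpha_i + \beta_j + \gamma_k + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + (\alpha\beta\gamma)_{ijk}$    $\bar{y}_{ijk.}$    $\sigma_e^2/n$
$\alpha_i - \alpha_{i'}$    $\bar{y}_{i...} - \bar{y}_{i'...}$    $2\sigma_e^2/bcn$
$\sum_{i=1}^{a}\ell_i\alpha_i$ $\left(\sum_{i=1}^{a}\ell_i = 0\right)$    $\sum_{i=1}^{a}\ell_i\bar{y}_{i...}$    $\left(\sum_{i=1}^{a}\ell_i^2\right)\sigma_e^2/bcn$
$(\alpha\beta)_{ij}$    $\bar{y}_{ij..} - \bar{y}_{i...} - \bar{y}_{.j..} + \bar{y}_{....}$    $(a-1)(b-1)\sigma_e^2/abcn$
$(\alpha\beta\gamma)_{ijk}$    $\bar{y}_{ijk.} - \bar{y}_{ij..} - \bar{y}_{i.k.} - \bar{y}_{.jk.} + \bar{y}_{i...} + \bar{y}_{.j..} + \bar{y}_{..k.} - \bar{y}_{....}$    $(a-1)(b-1)(c-1)\sigma_e^2/abcn$
$\sigma_e^2$    $MS_E$    $2\sigma_e^4/abc(n-1)$

* Estimates for $\beta_j$, $\mu + \beta_j$, $(\alpha\gamma)_{ik}$, and other parameters not included in Table 5.7 can be obtained
by interchanging appropriate subscripts in the estimates shown in the table.


Under Model III, with A fixed and B and C random, one is typically interested
in contrasts of the form $\sum_{i=1}^{a}\ell_i\alpha_i$ $\left(\sum_{i=1}^{a}\ell_i = 0\right)$; however, no exact interval for
a linear contrast is available. To see this, note that $\sum_{i=1}^{a}\ell_i\hat{\alpha}_i = \sum_{i=1}^{a}\ell_i\bar{y}_{i...}$ with

$$\mathrm{Var}\left(\sum_{i=1}^{a}\ell_i\bar{y}_{i...}\right) = \left(\sum_{i=1}^{a}\ell_i^2\right)\left(\sigma_e^2 + n\sigma_{\alpha\beta\gamma}^2 + cn\sigma_{\alpha\beta}^2 + bn\sigma_{\alpha\gamma}^2\right)/(bcn).$$

Thus, $\mathrm{Var}\left(\sum_{i=1}^{a}\ell_i\bar{y}_{i...}\right)$ cannot be estimated using only a single mean square in the
analysis of variance Table 5.2. Approximate intervals can be based on the Satterthwaite
procedure and a method due to Naik (1974). For further discussion of these
and other related procedures including a numerical example, see Burdick and
Graybill (1992, pp. 156-160). If two of the effects are fixed and one random,
we have seen that exact tests exist for all the hypotheses of interest. Thus,
exact intervals can be constructed for all estimable functions of fixed effect
parameters. For example, with A and C fixed and B random, it can be shown
that

$$\mathrm{Var}(\bar{y}_{i...}) = \frac{\sigma_\beta^2}{b} + \frac{(a-1)\sigma_{\alpha\beta}^2}{ab} + \frac{\sigma_e^2}{bcn} \tag{5.6.4}$$

and

$$\mathrm{Var}(\bar{y}_{i...} - \bar{y}_{i'...}) = \frac{2(\sigma_e^2 + cn\sigma_{\alpha\beta}^2)}{bcn}. \tag{5.6.5}$$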

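Remark: The results (5.6.4) and (5.6.5) can be derived as follows. First, since all the
random and mixed terms are independent of each other, their cross-products
have expected value equal to zero. Further, it follows that $\mathrm{Var}\{(\alpha\beta)_{ij}\} = E[(\alpha\beta)_{ij}^2] =
[(a-1)/a]\sigma_{\alpha\beta}^2$ and $E[(\alpha\beta)_{ij}(\alpha\beta)_{ij'}] = 0$ for $j \ne j'$. Similar results hold for $(\beta\gamma)_{jk}$
and $(\alpha\beta\gamma)_{ijk}$. Now, $E(\bar{y}_{i...}) = \mu + \alpha_i$ and $\mathrm{Var}(\bar{y}_{i...})$ is given by

$$
\begin{aligned}
\mathrm{Var}(\bar{y}_{i...}) &= E[\bar{y}_{i...} - \mu - \alpha_i]^2 \\
&= E\left[\frac{1}{bcn}\sum_{j=1}^{b}\sum_{k=1}^{c}\sum_{\ell=1}^{n}\left(\beta_j + (\alpha\beta)_{ij} + \gamma_k + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + (\alpha\beta\gamma)_{ijk} + e_{ijk\ell}\right)\right]^2 \\
&= E\left[\frac{1}{b}\sum_{j=1}^{b}\beta_j + \frac{1}{b}\sum_{j=1}^{b}(\alpha\beta)_{ij} + \frac{1}{bcn}\sum_{j=1}^{b}\sum_{k=1}^{c}\sum_{\ell=1}^{n}e_{ijk\ell}\right]^2 \\
&= \frac{b}{b^2}\sigma_\beta^2 + \frac{(a-1)b}{ab^2}\sigma_{\alpha\beta}^2 + \frac{bcn}{b^2c^2n^2}\sigma_e^2 \\
&= \frac{\sigma_\beta^2}{b} + \frac{(a-1)\sigma_{\alpha\beta}^2}{ab} + \frac{\sigma_e^2}{bcn}.
\end{aligned}
$$

Similarly, since $E[(\alpha\beta)_{ij}(\alpha\beta)_{i'j}] = -\sigma_{\alpha\beta}^2/a$ and $E[\bar{y}_{i...} - \bar{y}_{i'...}] = \alpha_i - \alpha_{i'}$,
$\mathrm{Var}(\bar{y}_{i...} - \bar{y}_{i'...})$ is given by

$$
\begin{aligned}
\mathrm{Var}(\bar{y}_{i...} - \bar{y}_{i'...}) &= E[(\bar{y}_{i...} - \alpha_i) - (\bar{y}_{i'...} - \alpha_{i'})]^2 \\
&= E\left[\frac{1}{b}\sum_{j=1}^{b}(\alpha\beta)_{ij} - \frac{1}{b}\sum_{j=1}^{b}(\alpha\beta)_{i'j} + \frac{1}{bcn}\sum_{j=1}^{b}\sum_{k=1}^{c}\sum_{\ell=1}^{n}e_{ijk\ell} - \frac{1}{bcn}\sum_{j=1}^{b}\sum_{k=1}^{c}\sum_{\ell=1}^{n}e_{i'jk\ell}\right]^2 \\
&= \frac{2(a-1)b}{ab^2}\sigma_{\alpha\beta}^2 - \frac{2}{b^2}E\left[\left(\sum_{j=1}^{b}(\alpha\beta)_{ij}\right)\left(\sum_{j=1}^{b}(\alpha\beta)_{i'j}\right)\right] + \frac{2\sigma_e^2}{bcn} \\
&= \frac{2(a-1)\sigma_{\alpha\beta}^2}{ab} + \frac{2b\sigma_{\alpha\beta}^2}{ab^2} + \frac{2\sigma_e^2}{bcn} \\
&= \frac{2(\sigma_e^2 + cn\sigma_{\alpha\beta}^2)}{bcn}.
\end{aligned}
$$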


The analysis of variance estimates of the variance components are readily
obtained from Table 5.2 by setting the mean squares equal to the expected
mean squares and solving for the desired variance components. For example,
under Model III with A fixed and B and C random, the estimates of variance
components are:

$$
\begin{aligned}
\hat{\sigma}_e^2 &= MS_E, \\
\hat{\sigma}_{\alpha\beta\gamma}^2 &= (MS_{ABC} - MS_E)/n, \\
\hat{\sigma}_{\beta\gamma}^2 &= (MS_{BC} - MS_E)/an, \\
\hat{\sigma}_{\alpha\gamma}^2 &= (MS_{AC} - MS_{ABC})/bn, \\
\hat{\sigma}_{\alpha\beta}^2 &= (MS_{AB} - MS_{ABC})/cn, \\
\hat{\sigma}_{\gamma}^2 &= (MS_C - MS_{BC})/abn,
\end{aligned}
$$

and

$$\hat{\sigma}_{\beta}^2 = (MS_B - MS_{BC})/acn.$$
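
As a minimal sketch of these estimators (A fixed, B and C random), the function below maps a dictionary of mean squares to the seven estimates listed above. The function name, the dictionary keys, and the numerical values in the usage example are hypothetical. Note that such estimates can turn out negative; the sketch returns them unchanged.

```python
def anova_estimates_a_fixed(ms, a, b, c, n):
    """ANOVA estimates of variance components; A fixed, B and C random.

    `ms` maps the labels "E", "ABC", "BC", "AC", "AB", "C", "B" to mean squares.
    """
    return {
        "sigma2_e": ms["E"],
        "sigma2_abg": (ms["ABC"] - ms["E"]) / n,
        "sigma2_bg": (ms["BC"] - ms["E"]) / (a * n),
        "sigma2_ag": (ms["AC"] - ms["ABC"]) / (b * n),
        "sigma2_ab": (ms["AB"] - ms["ABC"]) / (c * n),
        "sigma2_g": (ms["C"] - ms["BC"]) / (a * b * n),
        "sigma2_b": (ms["B"] - ms["BC"]) / (a * c * n),
    }


# Hypothetical mean squares for a = 3, b = 4, c = 5, n = 2.
ms = {"E": 1.0, "ABC": 2.0, "BC": 13.0, "AC": 4.0, "AB": 9.5, "C": 85.0, "B": 58.0}
print(anova_estimates_a_fixed(ms, a=3, b=4, c=5, n=2))
```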


For some results on confidence intervals for individual variance components 
and certain sums and ratios of variance components under Models II and III, 
including numerical examples, see Burdick and Graybill (1992, pp. 131-136, 
156-160). 


5.7 COMPUTATIONAL FORMULAE AND PROCEDURE 


Ordinarily, computer programs will be employed to perform the analysis of vari- 
ance calculations involving three or more factors. For completeness, however, 
we present the necessary computational formulae: 


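$$
\begin{aligned}
SS_T &= \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\sum_{\ell=1}^{n} y_{ijk\ell}^2 - \frac{y_{....}^2}{abcn}, \\
SS_A &= \frac{1}{bcn}\sum_{i=1}^{a} y_{i...}^2 - \frac{y_{....}^2}{abcn}, \\
SS_B &= \frac{1}{acn}\sum_{j=1}^{b} y_{.j..}^2 - \frac{y_{....}^2}{abcn}, \\
SS_C &= \frac{1}{abn}\sum_{k=1}^{c} y_{..k.}^2 - \frac{y_{....}^2}{abcn}, \\
SS_{AB} &= \frac{1}{cn}\sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij..}^2 - \frac{1}{bcn}\sum_{i=1}^{a} y_{i...}^2 - \frac{1}{acn}\sum_{j=1}^{b} y_{.j..}^2 + \frac{y_{....}^2}{abcn}, \\
SS_{AC} &= \frac{1}{bn}\sum_{i=1}^{a}\sum_{k=1}^{c} y_{i.k.}^2 - \frac{1}{bcn}\sum_{i=1}^{a} y_{i...}^2 - \frac{1}{abn}\sum_{k=1}^{c} y_{..k.}^2 + \frac{y_{....}^2}{abcn}, \\
SS_{BC} &= \frac{1}{an}\sum_{j=1}^{b}\sum_{k=1}^{c} y_{.jk.}^2 - \frac{1}{acn}\sum_{j=1}^{b} y_{.j..}^2 - \frac{1}{abn}\sum_{k=1}^{c} y_{..k.}^2 + \frac{y_{....}^2}{abcn}, \\
SS_{ABC} &= \frac{1}{n}\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c} y_{ijk.}^2 - \frac{1}{cn}\sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij..}^2 - \frac{1}{bn}\sum_{i=1}^{a}\sum_{k=1}^{c} y_{i.k.}^2 - \frac{1}{an}\sum_{j=1}^{b}\sum_{k=1}^{c} y_{.jk.}^2 \\
&\quad + \frac{1}{bcn}\sum_{i=1}^{a} y_{i...}^2 + \frac{1}{acn}\sum_{j=1}^{b} y_{.j..}^2 + \frac{1}{abn}\sum_{k=1}^{c} y_{..k.}^2 - \frac{y_{....}^2}{abcn},
\end{aligned}
$$

and

$$SS_E = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\sum_{\ell=1}^{n} y_{ijk\ell}^2 - \frac{1}{n}\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c} y_{ijk.}^2.$$

Alternatively, $SS_E$ can be obtained from the relation:

$$SS_E = SS_T - SS_A - SS_B - SS_C - SS_{AB} - SS_{AC} - SS_{BC} - SS_{ABC}.$$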
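
The computational formulae above share one pattern: each sum of squares is an alternating sum of "squared marginal totals divided by the number of observations per total." The following sketch exploits that pattern for a balanced layout stored as nested lists `y[i][j][k][l]`. The function names and the small data set are hypothetical; only the Python standard library is used.

```python
from itertools import combinations, product


def raw_term(y, dims, subset):
    """Sum of squared marginal totals over the axes in `subset`,
    divided by the number of observations entering each total."""
    totals = {}
    for idx in product(*(range(d) for d in dims)):
        value = y
        for p in idx:
            value = value[p]  # walk down the nested lists
        key = tuple(idx[p] for p in subset)
        totals[key] = totals.get(key, 0.0) + value
    n_obs = 1
    for d in dims:
        n_obs *= d
    per_total = n_obs // len(totals)
    return sum(t * t for t in totals.values()) / per_total


def sums_of_squares(y, dims):
    """dims = (a, b, c, n); the last axis indexes replications."""
    factors = tuple(range(len(dims) - 1))
    ss = {}
    for r in range(1, len(factors) + 1):
        for effect in combinations(factors, r):
            # Alternating sum over all subsets of the effect's factors.
            ss[effect] = sum(
                (-1) ** (len(effect) - q) * raw_term(y, dims, sub)
                for q in range(len(effect) + 1)
                for sub in combinations(effect, q)
            )
    every_axis = tuple(range(len(dims)))
    ss["T"] = raw_term(y, dims, every_axis) - raw_term(y, dims, ())
    ss["E"] = raw_term(y, dims, every_axis) - raw_term(y, dims, factors)
    return ss


# Hypothetical 2 x 2 x 2 layout with n = 2 observations per cell.
y = [[[[1.0, 2.0], [3.0, 5.0]], [[2.0, 4.0], [6.0, 7.0]]],
     [[[0.0, 1.0], [4.0, 4.0]], [[3.0, 5.0], [8.0, 9.0]]]]
ss = sums_of_squares(y, (2, 2, 2, 2))
print(ss[(0,)], ss[(0, 1)], ss[(0, 1, 2)], ss["E"], ss["T"])
```

Keys such as `(0,)`, `(0, 1)`, and `(0, 1, 2)` stand for A, AB, and ABC. Because the same alternating-sum pattern holds for any number of crossed factors, the sketch also applies to the four-factor case by passing a longer `dims` tuple.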


Remark: The preceding computational formulae can be readily extended if four or 
more factors are involved. In Section 5.11, we illustrate some of these computational 
formulae for the case of the four-factor crossed classification model. 


5.8 POWER OF THE ANALYSIS OF VARIANCE F TESTS 


Under Model I, the power of each of the F tests summarized in Table 5.3 can be 
obtained in the manner described for the one-way and two-way classification 
models. The normalized noncentrality parameter ¢ needed for calculating the 
power of each F test can be obtained as follows: 


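$$\phi = \frac{1}{\sigma_e}\left[\frac{\text{Numerator of Second Term in the Expected Mean Squares Column in Table 5.2}}{\text{Corresponding Degrees of Freedom} + 1}\right]^{1/2}.$$

For example, for testing the null hypothesis that the three-factor interaction
ABC is zero, we have

$$\phi_{ABC} = \frac{1}{\sigma_e}\left[\frac{n\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}(\alpha\beta\gamma)_{ijk}^2}{(a-1)(b-1)(c-1) + 1}\right]^{1/2}.$$

For the two-factor interaction AB,

$$\phi_{AB} = \frac{1}{\sigma_e}\left[\frac{cn\sum_{i=1}^{a}\sum_{j=1}^{b}(\alpha\beta)_{ij}^2}{(a-1)(b-1) + 1}\right]^{1/2},$$

and for the main effect A,

$$\phi_A = \frac{1}{\sigma_e}\left[\frac{bcn\sum_{i=1}^{a}\alpha_i^2}{a}\right]^{1/2},$$

and so on.
In an analogous manner, under Model II, for testing the null hypothesis that
the three-factor interaction ABC is zero, we have

$$\lambda_{ABC} = \left[1 + \frac{n\sigma_{\alpha\beta\gamma}^2}{\sigma_e^2}\right]^{1/2}.$$

For the two-factor interaction AB,

$$\lambda_{AB} = \left[1 + \frac{cn\sigma_{\alpha\beta}^2}{\sigma_e^2 + n\sigma_{\alpha\beta\gamma}^2}\right]^{1/2}.$$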


For testing the main effects, we have seen that no exact F tests are available. 
An approximate power of the pseudo-F tests discussed in Section 5.5 may, 
however, be computed (see, e.g., Scheffé (1959, p. 248)). 
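
The rule for the normalized noncentrality parameter under Model I amounts to one line of arithmetic. The sketch below is illustrative: the function name `phi_fixed` and the effect values in the usage example are hypothetical.

```python
from math import sqrt


def phi_fixed(multiplier, effects, df, sigma_e):
    """Normalized noncentrality parameter under Model I.

    multiplier : coefficient of the sum of squared effects (e.g. b*c*n for A)
    effects    : the fixed effects entering the hypothesis
    df         : degrees of freedom of the corresponding mean square
    sigma_e    : error standard deviation
    """
    numerator = multiplier * sum(e * e for e in effects)
    return sqrt(numerator / (df + 1)) / sigma_e


# Main effect A with a = 3, b = c = n = 2, hypothetical effects (-1, 0, 1).
a, b, c, n = 3, 2, 2, 2
print(phi_fixed(b * c * n, [-1.0, 0.0, 1.0], df=a - 1, sigma_e=2.0))
```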


5.9 MULTIPLE COMPARISON METHODS 


As before, multiple comparison methods can be utilized for the fixed as well as 
mixed effects models to test contrasts among cell means or factor level means. 
We briefly indicate the procedure for the fixed effects case. 

When the null hypothesis about a certain main effect or interaction is
rejected, Tukey, Scheffé, or other multiple comparison methods may be used to
investigate specific contrasts of interest. For example, if $H_0^{ABC}$ is rejected, we
may be interested in comparing the contrasts of cell means

$$\mu_{ijk} = \mu + \alpha_i + \beta_j + \gamma_k + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + (\alpha\beta\gamma)_{ijk}.$$

Then multiple comparison methods may be used to investigate the general
contrasts of the type

$$L = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\ell_{ijk}\mu_{ijk},$$

where

$$\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\ell_{ijk} = 0.$$

An unbiased estimator of L is

$$\hat{L} = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\ell_{ijk}\bar{y}_{ijk.},$$

for which the estimated variance is

$$\widehat{\mathrm{Var}}(\hat{L}) = \frac{MS_E}{n}\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\ell_{ijk}^2.$$

For Tukey's method involving pairwise comparisons, we have

$$T = q[abc,\ abc(n-1);\ 1-\alpha].$$

For Scheffé's method involving general contrasts, we would have

$$S^2 = F[abc-1,\ abc(n-1);\ 1-\alpha].$$

Furthermore, if $H_0^A$ is rejected, one may proceed to investigate contrasts
involving $\alpha_i$'s of the form

$$L = \sum_{i=1}^{a}\ell_i\alpha_i,$$

where

$$\sum_{i=1}^{a}\ell_i = 0.$$

Again, an unbiased estimator of L is

$$\hat{L} = \sum_{i=1}^{a}\ell_i\bar{y}_{i...}$$

with estimated variance

$$\widehat{\mathrm{Var}}(\hat{L}) = \frac{MS_E}{bcn}\sum_{i=1}^{a}\ell_i^2.$$

For Tukey's procedure involving pairwise differences, we have

$$T = q[a,\ abc(n-1);\ 1-\alpha].$$

For Scheffé's procedure involving general contrasts, we would have

$$S^2 = F[a-1,\ abc(n-1);\ 1-\alpha].$$

Contrasts based on $\beta_j$'s and $\gamma_k$'s can be investigated in an analogous manner.


5.10 THREE-WAY CLASSIFICATION WITH ONE 
OBSERVATION PER CELL 


If there is only one observation per cell in model (5.1.1) (i.e., $n = 1$), we
cannot estimate the error variance $\sigma_e^2$ from within-cell replications. In this case,
analysis of variance tests can be conducted only if it is possible to make an
additional assumption that certain interactions are zero. Usually, we would
assume that there is no three-factor interaction A x B x C. If it is possible
to assume that the A x B x C interaction is zero, then the corresponding mean
square $MS_{ABC}$ has expectation $\sigma_e^2$ and can be used as the error mean square
$MS_E$ to estimate the error variance $\sigma_e^2$. However, this layout does not allow
separation of the three-factor interaction term from the within-cell variation or
the error term.
The analysis of variance model in this case is written as

$$y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_k + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + e_{ijk} \quad \begin{cases} i = 1, 2, \ldots, a \\ j = 1, 2, \ldots, b \\ k = 1, 2, \ldots, c. \end{cases} \tag{5.10.1}$$


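All sums of squares and mean squares are calculated in the usual manner except
that now $n = 1$. The definitional and computational formulae for the sums of
squares are:

$$
\begin{aligned}
SS_A &= bc\sum_{i=1}^{a}(\bar{y}_{i..} - \bar{y}_{...})^2 = \frac{1}{bc}\sum_{i=1}^{a} y_{i..}^2 - \frac{1}{abc}y_{...}^2, \\
SS_B &= ac\sum_{j=1}^{b}(\bar{y}_{.j.} - \bar{y}_{...})^2 = \frac{1}{ac}\sum_{j=1}^{b} y_{.j.}^2 - \frac{1}{abc}y_{...}^2, \\
SS_C &= ab\sum_{k=1}^{c}(\bar{y}_{..k} - \bar{y}_{...})^2 = \frac{1}{ab}\sum_{k=1}^{c} y_{..k}^2 - \frac{1}{abc}y_{...}^2, \\
SS_{AB} &= c\sum_{i=1}^{a}\sum_{j=1}^{b}(\bar{y}_{ij.} - \bar{y}_{i..} - \bar{y}_{.j.} + \bar{y}_{...})^2 \\
&= \frac{1}{c}\sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij.}^2 - \frac{1}{bc}\sum_{i=1}^{a} y_{i..}^2 - \frac{1}{ac}\sum_{j=1}^{b} y_{.j.}^2 + \frac{1}{abc}y_{...}^2, \\
SS_{AC} &= b\sum_{i=1}^{a}\sum_{k=1}^{c}(\bar{y}_{i.k} - \bar{y}_{i..} - \bar{y}_{..k} + \bar{y}_{...})^2 \\
&= \frac{1}{b}\sum_{i=1}^{a}\sum_{k=1}^{c} y_{i.k}^2 - \frac{1}{bc}\sum_{i=1}^{a} y_{i..}^2 - \frac{1}{ab}\sum_{k=1}^{c} y_{..k}^2 + \frac{1}{abc}y_{...}^2, \\
SS_{BC} &= a\sum_{j=1}^{b}\sum_{k=1}^{c}(\bar{y}_{.jk} - \bar{y}_{.j.} - \bar{y}_{..k} + \bar{y}_{...})^2 \\
&= \frac{1}{a}\sum_{j=1}^{b}\sum_{k=1}^{c} y_{.jk}^2 - \frac{1}{ac}\sum_{j=1}^{b} y_{.j.}^2 - \frac{1}{ab}\sum_{k=1}^{c} y_{..k}^2 + \frac{1}{abc}y_{...}^2,
\end{aligned}
$$

and

$$
\begin{aligned}
SS_E &= \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}(y_{ijk} - \bar{y}_{ij.} - \bar{y}_{i.k} - \bar{y}_{.jk} + \bar{y}_{i..} + \bar{y}_{.j.} + \bar{y}_{..k} - \bar{y}_{...})^2 \\
&= \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c} y_{ijk}^2 - \frac{1}{c}\sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij.}^2 - \frac{1}{b}\sum_{i=1}^{a}\sum_{k=1}^{c} y_{i.k}^2 - \frac{1}{a}\sum_{j=1}^{b}\sum_{k=1}^{c} y_{.jk}^2 \\
&\quad + \frac{1}{bc}\sum_{i=1}^{a} y_{i..}^2 + \frac{1}{ac}\sum_{j=1}^{b} y_{.j.}^2 + \frac{1}{ab}\sum_{k=1}^{c} y_{..k}^2 - \frac{1}{abc}y_{...}^2.
\end{aligned}
$$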

The resulting analysis of variance table is shown in Table 5.8. In the case of 
Model I, all mean squares are tested against the mean square for error (three- 
factor interaction). When the null hypothesis about a certain main effect or a 
two-factor interaction is rejected, Tukey, Scheffé or other multiple comparison 
methods may be used to investigate contrasts of interest. Under Models II 
and III, the appropriate test statistics for various hypotheses of interest can 
be determined by examining expected mean squares in Table 5.8. However, 
again, there are no exact F tests for testing the hypotheses about main effects 
under Model II. Pseudo-F tests discussed earlier in Section 5.5 can similarly 
be developed. 


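TABLE 5.8
Analysis of Variance for Model (5.10.1)

Source of variation: Due to A. Degrees of freedom: $a-1$. Sum of squares: $SS_A$. Mean square: $MS_A$.
Expected mean square:
  Model I (A, B, and C fixed): $\sigma_e^2 + \frac{bc}{a-1}\sum_{i=1}^{a}\alpha_i^2$
  Model II (A, B, and C random): $\sigma_e^2 + c\sigma_{\alpha\beta}^2 + b\sigma_{\alpha\gamma}^2 + bc\sigma_\alpha^2$
  Model III (A fixed, B and C random): $\sigma_e^2 + c\sigma_{\alpha\beta}^2 + b\sigma_{\alpha\gamma}^2 + \frac{bc}{a-1}\sum_{i=1}^{a}\alpha_i^2$
  Model III (A random, B and C fixed): $\sigma_e^2 + bc\sigma_\alpha^2$

Source of variation: Due to B. Degrees of freedom: $b-1$. Sum of squares: $SS_B$. Mean square: $MS_B$.
Expected mean square:
  Model I (A, B, and C fixed): $\sigma_e^2 + \frac{ac}{b-1}\sum_{j=1}^{b}\beta_j^2$
  Model II (A, B, and C random): $\sigma_e^2 + c\sigma_{\alpha\beta}^2 + a\sigma_{\beta\gamma}^2 + ac\sigma_\beta^2$
  Model III (A fixed, B and C random): $\sigma_e^2 + a\sigma_{\beta\gamma}^2 + ac\sigma_\beta^2$
  Model III (A random, B and C fixed): $\sigma_e^2 + c\sigma_{\alpha\beta}^2 + \frac{ac}{b-1}\sum_{j=1}^{b}\beta_j^2$

Source of variation: Due to C. Degrees of freedom: $c-1$. Sum of squares: $SS_C$. Mean square: $MS_C$.
Expected mean square:
  Model I (A, B, and C fixed): $\sigma_e^2 + \frac{ab}{c-1}\sum_{k=1}^{c}\gamma_k^2$
  Model II (A, B, and C random): $\sigma_e^2 + b\sigma_{\alpha\gamma}^2 + a\sigma_{\beta\gamma}^2 + ab\sigma_\gamma^2$
  Model III (A fixed, B and C random): $\sigma_e^2 + a\sigma_{\beta\gamma}^2 + ab\sigma_\gamma^2$
  Model III (A random, B and C fixed): $\sigma_e^2 + b\sigma_{\alpha\gamma}^2 + \frac{ab}{c-1}\sum_{k=1}^{c}\gamma_k^2$

Source of variation: Interaction A x B. Degrees of freedom: $(a-1)(b-1)$. Sum of squares: $SS_{AB}$. Mean square: $MS_{AB}$.
Expected mean square:
  Model I (A, B, and C fixed): $\sigma_e^2 + \frac{c}{(a-1)(b-1)}\sum_{i=1}^{a}\sum_{j=1}^{b}(\alpha\beta)_{ij}^2$
  Model II (A, B, and C random): $\sigma_e^2 + c\sigma_{\alpha\beta}^2$
  Model III (A fixed, B and C random): $\sigma_e^2 + c\sigma_{\alpha\beta}^2$
  Model III (A random, B and C fixed): $\sigma_e^2 + c\sigma_{\alpha\beta}^2$

Source of variation: Interaction A x C. Degrees of freedom: $(a-1)(c-1)$. Sum of squares: $SS_{AC}$. Mean square: $MS_{AC}$.
Expected mean square:
  Model I (A, B, and C fixed): $\sigma_e^2 + \frac{b}{(a-1)(c-1)}\sum_{i=1}^{a}\sum_{k=1}^{c}(\alpha\gamma)_{ik}^2$
  Model II (A, B, and C random): $\sigma_e^2 + b\sigma_{\alpha\gamma}^2$
  Model III (A fixed, B and C random): $\sigma_e^2 + b\sigma_{\alpha\gamma}^2$
  Model III (A random, B and C fixed): $\sigma_e^2 + b\sigma_{\alpha\gamma}^2$

Source of variation: Interaction B x C. Degrees of freedom: $(b-1)(c-1)$. Sum of squares: $SS_{BC}$. Mean square: $MS_{BC}$.
Expected mean square:
  Model I (A, B, and C fixed): $\sigma_e^2 + \frac{a}{(b-1)(c-1)}\sum_{j=1}^{b}\sum_{k=1}^{c}(\beta\gamma)_{jk}^2$
  Model II (A, B, and C random): $\sigma_e^2 + a\sigma_{\beta\gamma}^2$
  Model III (A fixed, B and C random): $\sigma_e^2 + a\sigma_{\beta\gamma}^2$
  Model III (A random, B and C fixed): $\sigma_e^2 + \frac{a}{(b-1)(c-1)}\sum_{j=1}^{b}\sum_{k=1}^{c}(\beta\gamma)_{jk}^2$

Source of variation: Interaction A x B x C (Error). Degrees of freedom: $(a-1)(b-1)(c-1)$. Sum of squares: $SS_E$. Mean square: $MS_E$.
Expected mean square (all four models): $\sigma_e^2$

Source of variation: Total. Degrees of freedom: $abc-1$. Sum of squares: $SS_T$.


5.11 FOUR-WAY CROSSED CLASSIFICATION


The analysis of variance in a four-way classification is obtained as a
straightforward generalization of the three-way classification and we discuss it only
briefly. The model is given by

$$
\begin{aligned}
y_{ijk\ell m} = {} & \mu + \alpha_i + \beta_j + \gamma_k + \delta_\ell + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} \\
& + (\alpha\delta)_{i\ell} + (\beta\gamma)_{jk} + (\beta\delta)_{j\ell} + (\gamma\delta)_{k\ell} \\
& + (\alpha\beta\gamma)_{ijk} + (\alpha\beta\delta)_{ij\ell} + (\alpha\gamma\delta)_{ik\ell} \\
& + (\beta\gamma\delta)_{jk\ell} + (\alpha\beta\gamma\delta)_{ijk\ell} + e_{ijk\ell m}
\end{aligned}
\quad
\begin{cases} i = 1, \ldots, a \\ j = 1, \ldots, b \\ k = 1, \ldots, c \\ \ell = 1, \ldots, d \\ m = 1, \ldots, n, \end{cases}
\tag{5.11.1}
$$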

where the $e_{ijk\ell m}$'s are independently and normally distributed with zero mean and
variance $\sigma_e^2$. The assumptions on other effects can analogously be stated
depending upon whether a factor is fixed or random. Note that the model equation
(5.11.1) has 17 terms: a general mean, one main effect for each of the four
factors, six two-factor interactions, four three-factor interactions, one four-factor
interaction, and a residual or error term.

The usual identity $y_{ijk\ell m} - \bar{y}_{.....} = $ etc., contains the following groups of terms
on its right-hand side:

(i) Estimates of the four main effects, for example, $\bar{y}_{i....} - \bar{y}_{.....}$, which gives
an estimate of $\alpha_i$.

(ii) Estimates of the six two-way interactions, for example, $\bar{y}_{ij...} - \bar{y}_{i....} -
\bar{y}_{.j...} + \bar{y}_{.....}$, which gives an estimate of $(\alpha\beta)_{ij}$.

(iii) Estimates of the four three-way interactions, for example, $\bar{y}_{ijk..} - \bar{y}_{ij...} -
\bar{y}_{i.k..} - \bar{y}_{.jk..} + \bar{y}_{i....} + \bar{y}_{.j...} + \bar{y}_{..k..} - \bar{y}_{.....}$, which gives an estimate of
$(\alpha\beta\gamma)_{ijk}$.

(iv) Estimate of the single four-way interaction, which will have the form
$\bar{y}_{ijk\ell.} - [\bar{y}_{.....}$ + four main effects + six two-way interactions + four
three-way interactions].

(v) The deviations of the individual observations from the cell means, for
example, $y_{ijk\ell m} - \bar{y}_{ijk\ell.}$.


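The partition of the total sum of squares is effected by squaring and summing
over all indices on both sides of the identity $y_{ijk\ell m} - \bar{y}_{.....} = $ and so on. The
typical sums of squares and corresponding computational formulae are:

$$
\begin{aligned}
SS_A &= bcdn\sum_{i=1}^{a}(\bar{y}_{i....} - \bar{y}_{.....})^2 = \frac{1}{bcdn}\sum_{i=1}^{a} y_{i....}^2 - \frac{1}{abcdn}y_{.....}^2, \\
SS_{AB} &= cdn\sum_{i=1}^{a}\sum_{j=1}^{b}(\bar{y}_{ij...} - \bar{y}_{i....} - \bar{y}_{.j...} + \bar{y}_{.....})^2 \\
&= \frac{1}{cdn}\sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij...}^2 - \frac{1}{abcdn}y_{.....}^2 - (SS_A + SS_B), \\
SS_{ABC} &= dn\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}(\bar{y}_{ijk..} - \bar{y}_{ij...} - \bar{y}_{i.k..} - \bar{y}_{.jk..} + \bar{y}_{i....} + \bar{y}_{.j...} + \bar{y}_{..k..} - \bar{y}_{.....})^2 \\
&= \frac{1}{dn}\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c} y_{ijk..}^2 - \frac{1}{abcdn}y_{.....}^2 - (SS_A + SS_B + SS_C + SS_{AB} + SS_{AC} + SS_{BC}), \\
SS_{ABCD} &= \frac{1}{n}\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\sum_{\ell=1}^{d} y_{ijk\ell.}^2 - \frac{1}{abcdn}y_{.....}^2 - (SS_A + SS_B + SS_C \\
&\quad + SS_D + SS_{AB} + SS_{AC} + SS_{AD} + SS_{BC} + SS_{BD} + SS_{CD} \\
&\quad + SS_{ABC} + SS_{ABD} + SS_{ACD} + SS_{BCD}), \\
SS_E &= \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\sum_{\ell=1}^{d}\sum_{m=1}^{n}(y_{ijk\ell m} - \bar{y}_{ijk\ell.})^2 \\
&= \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\sum_{\ell=1}^{d}\sum_{m=1}^{n} y_{ijk\ell m}^2 - \frac{1}{n}\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\sum_{\ell=1}^{d} y_{ijk\ell.}^2.
\end{aligned}
$$

The degrees of freedom for the preceding sums of squares are $a-1$, $(a-1)(b-1)$,
$(a-1)(b-1)(c-1)$, $(a-1)(b-1)(c-1)(d-1)$, and $abcd(n-1)$,
respectively. The expected mean squares can be derived as before.³ For example,
under Model II,

$$
\begin{aligned}
E(MS_A) &= \sigma_e^2 + n\sigma_{\alpha\beta\gamma\delta}^2 + dn\sigma_{\alpha\beta\gamma}^2 + cn\sigma_{\alpha\beta\delta}^2 + bn\sigma_{\alpha\gamma\delta}^2 + cdn\sigma_{\alpha\beta}^2 \\
&\quad + bdn\sigma_{\alpha\gamma}^2 + bcn\sigma_{\alpha\delta}^2 + bcdn\sigma_\alpha^2, \\
E(MS_{AB}) &= \sigma_e^2 + n\sigma_{\alpha\beta\gamma\delta}^2 + dn\sigma_{\alpha\beta\gamma}^2 + cn\sigma_{\alpha\beta\delta}^2 + cdn\sigma_{\alpha\beta}^2, \\
E(MS_{ABC}) &= \sigma_e^2 + n\sigma_{\alpha\beta\gamma\delta}^2 + dn\sigma_{\alpha\beta\gamma}^2, \\
E(MS_{ABCD}) &= \sigma_e^2 + n\sigma_{\alpha\beta\gamma\delta}^2,
\end{aligned}
$$

and

$$E(MS_E) = \sigma_e^2.$$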


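Expected mean squares under Model III depend on the particular combination
of fixed and random factors. For example, for an experiment with A and C fixed,
and B and D random, the $(\alpha\beta\gamma\delta)_{ijk\ell}$'s are assumed to be distributed with mean
zero and variance $\sigma_{\alpha\beta\gamma\delta}^2$, subject to the restrictions that
$\sum_{i=1}^{a}(\alpha\beta\gamma\delta)_{ijk\ell} = 0 = \sum_{k=1}^{c}(\alpha\beta\gamma\delta)_{ijk\ell}$. The assumptions imply that

$$
\begin{aligned}
E\left[\sum_{i=1}^{a}\sum_{k=1}^{c}(\alpha\beta\gamma\delta)_{ijk\ell}^2\right] &= (a-1)(c-1)\,\sigma_{\alpha\beta\gamma\delta}^2, \\
E\left[(\alpha\beta\gamma\delta)_{ijk\ell}^2\right] &= \frac{(a-1)(c-1)}{ac}\,\sigma_{\alpha\beta\gamma\delta}^2, \\
E\left[(\alpha\beta\gamma\delta)_{ijk\ell}(\alpha\beta\gamma\delta)_{ij'k\ell'}\right] &= 0, \quad \text{for } j \ne j',\ \ell \ne \ell' \text{ or both } j \ne j' \text{ and } \ell \ne \ell', \\
E\left[(\alpha\beta\gamma\delta)_{ijk\ell}(\alpha\beta\gamma\delta)_{i'jk\ell}\right] &= -\left[\frac{c-1}{ac}\right]\sigma_{\alpha\beta\gamma\delta}^2, \quad i \ne i', \\
E\left[(\alpha\beta\gamma\delta)_{ijk\ell}(\alpha\beta\gamma\delta)_{ijk'\ell}\right] &= -\left[\frac{a-1}{ac}\right]\sigma_{\alpha\beta\gamma\delta}^2, \quad k \ne k',
\end{aligned}
$$

and

$$E\left[(\alpha\beta\gamma\delta)_{ijk\ell}(\alpha\beta\gamma\delta)_{i'jk'\ell}\right] = \frac{\sigma_{\alpha\beta\gamma\delta}^2}{ac}, \quad i \ne i' \text{ and } k \ne k'.$$

³ One can use the rules formulated by Schultz (1955) and others to reproduce the results on expected
mean squares rather quickly. See Appendix U for a discussion of rules for finding expected mean
squares.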

Now, the results on expected mean squares follow readily. Finally, for a given 
model — fixed, random, or mixed — the point and interval estimates and tests of 
hypotheses corresponding to parameters of interest can be developed analogous 
to results for the three-way classification. 


Remark: For a balanced crossed classification model involving only random effects,
there are some simple rules to calculate the coefficients of the variance components in
the expected mean square. The rules are stated as follows for a model containing four
factors.

(a) All expected mean squares contain $\sigma_e^2$ with coefficient 1.

(b) The coefficient of a variance component is zero or $abcdn$ divided by the product
of the levels of the factors contained in the variance component. For example,
the coefficient of $\sigma_{\alpha\beta\gamma\delta}^2$ is equal to $abcdn/abcd = n$.

(c) The coefficient of the variance component in the expected mean square of a
main factor or interaction between factors is zero if the product of the levels
of the factors contained in the variance component cannot be divided by the
level of the factor or the product of the levels of the factors. For example,
the coefficient of $\sigma_{\beta\gamma}^2$ in $E(MS_A)$ is zero since $bc$ cannot be divided by $a$.
Similarly, the coefficient of $\sigma_{\beta\gamma\delta}^2$ in $E(MS_{AB})$ is zero since $bcd$ cannot be divided
by $ab$.

(d) A quick check on the correctness of the coefficients of variance components can
be made by noting that for a given variance component, the weighted sum of
coefficients corresponding to all the mean squares, including that of the mean,
is $abcdn$, where the weights are taken as the degrees of freedom.
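
Rules (b) through (d) lend themselves to a mechanical check. The sketch below reads the divisibility in rule (c) symbolically: a component enters $E(MS)$ exactly when every factor of the mean square also appears in the component. The function names and the numbers of levels in the usage example are hypothetical, and the $\sigma_e^2$ term of rule (a) is left implicit.

```python
from itertools import combinations
from math import prod


def random_model_ems(levels, n):
    """Coefficients of variance components in E(MS) for a balanced,
    fully crossed random effects model (sigma_e^2 has coefficient 1)."""
    factors = sorted(levels)
    total = n * prod(levels.values())
    effects = [e for r in range(1, len(factors) + 1)
               for e in combinations(factors, r)]
    ems = {}
    for ms in effects:
        # Rule (c): keep only components containing every factor of `ms`.
        # Rule (b): coefficient is total / (product of the component's levels).
        ems[ms] = {comp: total // prod(levels[f] for f in comp)
                   for comp in effects if set(ms) <= set(comp)}
    return ems


def check_rule_d(levels, n):
    """Rule (d): df-weighted coefficients, plus the mean's share, sum to abcdn."""
    ems = random_model_ems(levels, n)
    total = n * prod(levels.values())
    for comp in ems:
        weighted = total // prod(levels[f] for f in comp)  # mean, 1 df
        for ms, row in ems.items():
            if comp in row:
                weighted += prod(levels[f] - 1 for f in ms) * row[comp]
        if weighted != total:
            return False
    return True


levels = {"A": 2, "B": 3, "C": 4, "D": 5}  # hypothetical numbers of levels
print(random_model_ems(levels, n=2)[("A", "B")])
print(check_rule_d(levels, n=2))
```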


5.12 HIGHER-ORDER CROSSED CLASSIFICATIONS 


The reader should now be able to see how the four-way crossed classification 
analysis can be further generalized to five- and higher-order classifications. 
The formal symmetry in the sums of squares and degrees of freedom for the 
balanced case makes direct generalizations to the higher-order crossed
classification models quite straightforward. For example, a full p-way crossed
classification involving p crossed factors contains $2^p + 1$ terms in the model
equation: a general mean, one main effect for each of the p factors, $\binom{p}{2}$ two-factor
interactions, $\binom{p}{3}$ three-factor interactions, and so on; and the total number
of main effects and interactions to be tested is equal to $2^p - 1$. Computational
formulae given in the preceding section can be readily extended if more than 
four factors are studied simultaneously. However, when the number of factors 
is large, the algebra becomes extremely tedious and the amount of computa- 
tional work increases rapidly. Most of the algebra can be simplified by the 
use of “operators” as discussed by Bankier (1960a,b). The details on mecha- 
nization of the computational procedure on a digital computer can be found 
in the papers of Hartley (1956), Hemmerle (1964), and Bock (1963), and in 
the books by Peng (1967, pp. 47-50), Cooley and Lohnes (1962), and Dixon 
(1992). Hartley (1962) has suggested a simple and ingenious device of using 
a factorial analysis of variance without replication (with as many factors as 
necessary) to analyze many other designs on a digital computer, where data 
from any design are presented and analyzed as though they were a factorial 
experiment. 
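
The term count for a full p-way crossed model can be confirmed by enumeration. This is a small illustrative sketch; the function name `model_terms` and the factor labels are hypothetical.

```python
from itertools import combinations


def model_terms(factors):
    """List the terms of a full crossed model: mean, all effects, error."""
    effects = ["".join(combo)
               for r in range(1, len(factors) + 1)
               for combo in combinations(factors, r)]
    return ["mean"] + effects + ["error"]


terms = model_terms("ABCD")
print(len(terms))       # number of terms in the four-way model equation
print(terms[1:-1])      # main effects and interactions to be tested
```

For four factors the enumeration reproduces the 17 terms of model (5.11.1), of which 15 are main effects and interactions.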

There are several procedures for deriving expected mean squares in an anal- 
ysis of variance involving higher-order crossed classification models. They are, 
however, more readily written down following an easy set of rules. The in- 
terested reader is referred to papers by Schultz (1955), Cornfield and Tukey 
(1956), Millman and Glass (1967), Henderson (1959, 1969), Lorenzen (1977), 
and Blackwell et al. (1991), including books by Bennett and Franklin (1954), 
Scheffé (1959), and Lorenzen and Anderson (1993) for detailed discussions of 
these rules. A brief description of these rules is given in Appendix U. Finally, it 
should be stressed that in a higher-order crossed classification involving many 
factors, the complexity of the experiment as well as the analysis of data in- 
creases as the number of factors becomes large. In addition to providing a large 
number of experimental units, there are many interaction terms that must be 
evaluated and interpreted. Moreover, the tasks of evaluating the expected mean 
squares and performing the tests of significance for each source of variation also 
become increasingly complex. One common source of difficulty encountered 
in analyzing higher-order random and mixed factorials is that there is often no 
appropriate error term against which to test a given mean square. Frequently,
the appropriate tests are carried out using an approximate procedure due to
Satterthwaite (1946). It should, however, be mentioned that although the num- 
ber of interaction terms in a higher-order classification increases rather rapidly, 
in many cases these interactions are so remote and difficult to interpret that they 
are frequently ignored and their sums of squares and degrees of freedom pooled 
with the residual. 

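We outline below an analysis of variance for the r-way classification involving
factors $A_1, A_2, \ldots, A_r$. Let $a_i$ be the number of levels associated with the
factor $A_i$ $(i = 1, 2, \ldots, r)$, and suppose there are n observations to be taken
at every combination of the levels of $A_1, A_2, \ldots, A_r$. The model for an r-way
classification can be written as

\[
\begin{aligned}
y_{i_1 i_2 \ldots i_r s} = {} & \mu + (\alpha_1)_{i_1} + \cdots + (\alpha_r)_{i_r}
    + (\alpha_1\alpha_2)_{i_1 i_2} + \cdots \\
  & + (\alpha_{r-1}\alpha_r)_{i_{r-1} i_r} + (\alpha_1\alpha_2\alpha_3)_{i_1 i_2 i_3} + \cdots \\
  & + (\alpha_{r-2}\alpha_{r-1}\alpha_r)_{i_{r-2} i_{r-1} i_r} + \cdots \\
  & + (\alpha_1\alpha_2 \ldots \alpha_r)_{i_1 i_2 \ldots i_r} + e_{i_1 i_2 \ldots i_r s},
\end{aligned}
\qquad
\begin{cases}
i_1 = 1, 2, \ldots, a_1 \\
i_2 = 1, 2, \ldots, a_2 \\
\quad\vdots \\
i_r = 1, 2, \ldots, a_r \\
s = 1, 2, \ldots, n,
\end{cases}
\tag{5.12.1}
\]

where $y_{i_1 i_2 \ldots i_r s}$ is the s-th observation corresponding to the $i_1$-th level of $A_1$,
the $i_2$-th level of $A_2$, ..., and the $i_r$-th level of $A_r$; $-\infty < \mu < \infty$ is a constant;
$(\alpha_j)_{i_j}$ is the effect of the $i_j$-th level of $A_j$ $(j = 1, 2, \ldots, r)$;
$(\alpha_j\alpha_k)_{i_j i_k}$ is the effect of the interaction between the $i_j$-th level of $A_j$
and the $i_k$-th level of $A_k$ $(j < k = 1, 2, \ldots, r)$; $(\alpha_j\alpha_k\alpha_\ell)_{i_j i_k i_\ell}$ is the
interaction between the $i_j$-th level of $A_j$, the $i_k$-th level of $A_k$, and the $i_\ell$-th
level of $A_\ell$ $(j < k < \ell = 1, 2, \ldots, r)$; ...; $(\alpha_1\alpha_2 \ldots \alpha_r)_{i_1 i_2 \ldots i_r}$ is the
interaction between the $i_1$-th level of $A_1$, the $i_2$-th level of $A_2$, ..., and the
$i_r$-th level of $A_r$; and finally $e_{i_1 i_2 \ldots i_r s}$ is the customary error term.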

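The usual identity $y_{i_1 i_2 \ldots i_r s} - \bar{y}_{. \ldots .} = \text{etc.}$ contains the following groups of
terms on its right-hand side:

• Estimates of r main effects; e.g., $\bar{y}_{i_1 . \ldots .} - \bar{y}_{. \ldots .}$, which gives an estimate
  of $(\alpha_j)_{i_j}$ $(j = 1, 2, \ldots, r)$.

• Estimates of $\binom{r}{2}$ two-way interactions; e.g.,
  $\bar{y}_{i_1 i_2 . \ldots .} - \bar{y}_{i_1 . \ldots .} - \bar{y}_{. i_2 . \ldots .} + \bar{y}_{. \ldots .}$,
  which gives an estimate of $(\alpha_j\alpha_k)_{i_j i_k}$ $(j < k = 1, 2, \ldots, r)$.

• Estimate of the single r-way interaction of the form
  $\bar{y}_{i_1 i_2 \ldots i_r .} - [\bar{y}_{. \ldots .} + r \text{ main effects} + \binom{r}{2} \text{ two-way interactions} + \cdots + \binom{r}{r-1}\ (r-1)\text{-way interactions}]$.

• The deviations of the individual observations from the cell means; e.g.,
  $y_{i_1 i_2 \ldots i_r s} - \bar{y}_{i_1 i_2 \ldots i_r .}$.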


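The partition of the total sum of squares is effected by squaring and summing
over all indices of the identity $y_{i_1 i_2 \ldots i_r s} - \bar{y}_{. \ldots .} = \text{etc.}$ The typical sums of squares
can be expressed as follows:

\[
SS_{A_1} = a_2 a_3 \ldots a_r n \sum_{i_1=1}^{a_1} (\bar{y}_{i_1 . \ldots .} - \bar{y}_{. \ldots .})^2, \ \text{etc.},
\]

\[
SS_{A_1 A_2} = a_3 a_4 \ldots a_r n \sum_{i_1=1}^{a_1} \sum_{i_2=1}^{a_2}
  (\bar{y}_{i_1 i_2 . \ldots .} - \bar{y}_{i_1 . \ldots .} - \bar{y}_{. i_2 . \ldots .} + \bar{y}_{. \ldots .})^2, \ \text{etc.},
\]

\[
\begin{aligned}
SS_{A_1 A_2 \ldots A_r} = n \sum_{i_1=1}^{a_1} \sum_{i_2=1}^{a_2} \cdots \sum_{i_r=1}^{a_r}
  \big[ & \bar{y}_{i_1 i_2 \ldots i_r .} - (\bar{y}_{i_1 i_2 \ldots i_{r-1} . .} + \cdots) + \cdots \\
  & + (-1)^{r-2} (\bar{y}_{i_1 i_2 . \ldots .} + \cdots) + (-1)^{r-1} (\bar{y}_{i_1 . \ldots .} + \cdots) \\
  & + (-1)^{r}\, \bar{y}_{. \ldots .} \big]^2,
\end{aligned}
\]

and

\[
SS_E = \sum_{i_1=1}^{a_1} \sum_{i_2=1}^{a_2} \cdots \sum_{i_r=1}^{a_r} \sum_{s=1}^{n}
  (y_{i_1 i_2 \ldots i_r s} - \bar{y}_{i_1 i_2 \ldots i_r .})^2.
\]

The degrees of freedom for the above sums of squares are $a_1 - 1$, $(a_1 - 1)(a_2 - 1)$, ...,
$(a_1 - 1)(a_2 - 1) \ldots (a_r - 1)$, and $a_1 a_2 \ldots a_r (n - 1)$, respectively.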
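
As a quick check on the degrees of freedom above, the following sketch (not from the original text; the function name and the example level counts are illustrative assumptions) tabulates them for every effect and confirms that they add up to the total, $a_1 a_2 \ldots a_r n - 1$.

```python
from itertools import combinations
from math import prod


def degrees_of_freedom(levels, n):
    """Map each effect (tuple of 0-based factor indices) to its degrees of freedom."""
    r = len(levels)
    df = {}
    for order in range(1, r + 1):
        for subset in combinations(range(r), order):
            # An interaction's df is the product of (a_j - 1) over its factors.
            df[subset] = prod(levels[j] - 1 for j in subset)
    df["error"] = prod(levels) * (n - 1)
    return df


levels = [3, 3, 2]   # assumed example: a_1 = 3, a_2 = 3, a_3 = 2
n = 4                # observations per cell
df = degrees_of_freedom(levels, n)
total_df = sum(df.values())
```

With these illustrative values the effect and error degrees of freedom sum to 3 x 3 x 2 x 4 - 1 = 71.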

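Under Model I, $(\alpha_j)_{i_j}$ $(j = 1, 2, \ldots, r)$, $(\alpha_j\alpha_k)_{i_j i_k}$ $(j < k = 1, 2, \ldots, r)$,
$(\alpha_j\alpha_k\alpha_\ell)_{i_j i_k i_\ell}$ $(j < k < \ell = 1, 2, \ldots, r)$, ..., and $(\alpha_1\alpha_2 \ldots \alpha_r)_{i_1 i_2 \ldots i_r}$ are
assumed to be constants subject to the following restrictions:

\[
\sum_{i_j=1}^{a_j} (\alpha_j)_{i_j} = 0, \qquad j = 1, 2, \ldots, r;
\]

\[
\sum_{i_j=1}^{a_j} (\alpha_j\alpha_k)_{i_j i_k} = \sum_{i_k=1}^{a_k} (\alpha_j\alpha_k)_{i_j i_k} = 0,
  \qquad j < k = 1, 2, \ldots, r;
\]

\[
\sum_{i_j=1}^{a_j} (\alpha_j\alpha_k\alpha_\ell)_{i_j i_k i_\ell}
  = \sum_{i_k=1}^{a_k} (\alpha_j\alpha_k\alpha_\ell)_{i_j i_k i_\ell}
  = \sum_{i_\ell=1}^{a_\ell} (\alpha_j\alpha_k\alpha_\ell)_{i_j i_k i_\ell} = 0,
  \qquad j < k < \ell = 1, 2, \ldots, r;
\]

and

\[
\sum_{i_1=1}^{a_1} (\alpha_1\alpha_2 \ldots \alpha_r)_{i_1 i_2 \ldots i_r}
  = \sum_{i_2=1}^{a_2} (\alpha_1\alpha_2 \ldots \alpha_r)_{i_1 i_2 \ldots i_r}
  = \cdots
  = \sum_{i_r=1}^{a_r} (\alpha_1\alpha_2 \ldots \alpha_r)_{i_1 i_2 \ldots i_r} = 0.
\]

Furthermore, the $e_{i_1 i_2 \ldots i_r s}$'s are uncorrelated and randomly distributed with zero
means and variance $\sigma_e^2$. The expected mean squares are obtained as follows: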


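\[
E(MS_{A_1}) = \sigma_e^2 + \frac{a_2 a_3 \ldots a_r n}{a_1 - 1} \sum_{i_1=1}^{a_1} (\alpha_1)_{i_1}^2, \ \text{etc.},
\]

\[
E(MS_{A_1 A_2}) = \sigma_e^2 + \frac{a_3 a_4 \ldots a_r n}{(a_1 - 1)(a_2 - 1)}
  \sum_{i_1=1}^{a_1} \sum_{i_2=1}^{a_2} (\alpha_1\alpha_2)_{i_1 i_2}^2, \ \text{etc.},
\]

\[
E(MS_{A_1 A_2 \ldots A_r}) = \sigma_e^2 + \frac{n}{(a_1 - 1)(a_2 - 1) \ldots (a_r - 1)}
  \sum_{i_1=1}^{a_1} \sum_{i_2=1}^{a_2} \cdots \sum_{i_r=1}^{a_r} (\alpha_1\alpha_2 \ldots \alpha_r)_{i_1 i_2 \ldots i_r}^2,
\]

and

\[
E(MS_E) = \sigma_e^2.
\]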


Assuming normality for the error terms, all tests of hypotheses are carried out 
by an appropriate F statistic obtained as the ratio of the mean square of the 
effect being tested to the error mean square. 

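Under Model II, the $(\alpha_j)_{i_j}$'s, $(\alpha_j\alpha_k)_{i_j i_k}$'s, $(\alpha_j\alpha_k\alpha_\ell)_{i_j i_k i_\ell}$'s, ...,
$(\alpha_1\alpha_2 \ldots \alpha_r)_{i_1 i_2 \ldots i_r}$'s, and $e_{i_1 i_2 \ldots i_r s}$'s are assumed to be mutually and
completely uncorrelated random variables with zero means and variances
$\sigma_{\alpha_1}^2, \ldots, \sigma_{\alpha_r}^2$; $\sigma_{\alpha_1\alpha_2}^2, \ldots, \sigma_{\alpha_{r-1}\alpha_r}^2$; ...;
$\sigma_{\alpha_1\alpha_2 \ldots \alpha_r}^2$; and $\sigma_e^2$, respectively. From (5.12.1), the variance of any observation
is

\[
\mathrm{Var}(y_{i_1 i_2 \ldots i_r s}) = \sigma_{\alpha_1}^2 + \cdots + \sigma_{\alpha_r}^2
  + \sigma_{\alpha_1\alpha_2}^2 + \cdots + \sigma_{\alpha_{r-1}\alpha_r}^2 + \cdots
  + \sigma_{\alpha_1\alpha_2 \ldots \alpha_r}^2 + \sigma_e^2,
\]

and, thus, $\sigma_{\alpha_1}^2, \ldots, \sigma_{\alpha_r}^2$; $\sigma_{\alpha_1\alpha_2}^2, \ldots, \sigma_{\alpha_{r-1}\alpha_r}^2$; ...;
$\sigma_{\alpha_1\alpha_2 \ldots \alpha_r}^2$; and $\sigma_e^2$ are the variance
components of model (5.12.1). The expected value of the mean square
corresponding to any source, for example, the interaction
$A_{j_1} \times A_{j_2} \times \cdots \times A_{j_m}$, is obtained as follows:

\[
\sigma_e^2 + n \sum q_{k_1 k_2 \ldots k_p}\, \sigma_{\alpha_{k_1}\alpha_{k_2} \ldots \alpha_{k_p}}^2,
\]

where the summation is carried over all the variance components except $\sigma_e^2$.
The coefficients $q_{k_1 k_2 \ldots k_p}$ of the variance components are given by

\[
q_{k_1 k_2 \ldots k_p} =
\begin{cases}
\dfrac{a_1 a_2 \ldots a_r}{a_{k_1} a_{k_2} \ldots a_{k_p}}, & \text{if } (j_1, j_2, \ldots, j_m) \text{ is a subset of } (k_1, k_2, \ldots, k_p), \\[2ex]
0, & \text{otherwise.}
\end{cases}
\]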

Remark: In many factorial experiments involving an r-way classification where the
levels of a factor correspond to a fixed measured quantity, such as levels of temperature,
quantity of fertilizer, etc., the investigator is often interested in studying the nature of 
the response surface. For instance, she may want to determine the value at which the 
response surface is maximum or minimum. For discussions of the response surface 
methodology, the reader is referred to Myers (1976), Box and Draper (1987), Myers and 
Montgomery (1995), and Khuri and Cornell (1996). 


5.13 UNEQUAL SAMPLE SIZES IN THREE- AND 
HIGHER-ORDER CLASSIFICATIONS 


When the sample sizes in three- or higher-order crossed classifications are 
not all equal, the procedures described in Section 4.10 can be used with the 
customary modifications. The formulae for the two-way model need simply 
be extended for experiments involving three and more factors. However, the 
computation of the analysis of variance in the general case of disproportionate 
frequencies tends to be extremely involved. Various aspects of the analysis of 
nonorthogonal three- and higher-order classifications have been considered by 
a number of authors. For further discussions and details the interested reader is 
referred to Kendall et al. (1983, Sections 35.43 and 35.44) and references cited 
therein. 

In the following, we outline an analysis of variance for the unbalanced
three-way crossed classification and indicate its extension to higher-order
classifications. The model for the three-way crossed classification remains the same as
in (5.1.1), except that the sample size corresponding to the (i, j, k)-th cell will
now be denoted by $n_{ijk}$. We consider the analysis when the unequal sample
sizes follow a proportional pattern; that is,

\[
n_{ijk} = \frac{n_{i..}\, n_{.j.}\, n_{..k}}{N^2}.
\]

For this case, the analysis of variance discussed earlier in this chapter can
be employed with suitable modifications. For example, the definitional and
computational formulae for the sums of squares are:


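\[
\begin{aligned}
SS_T &= \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\sum_{\ell=1}^{n_{ijk}} (y_{ijk\ell} - \bar{y}_{....})^2
      = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\sum_{\ell=1}^{n_{ijk}} y_{ijk\ell}^2 - \frac{y_{....}^2}{N}, \\
SS_A &= \sum_{i=1}^{a} n_{i..} (\bar{y}_{i...} - \bar{y}_{....})^2
      = \sum_{i=1}^{a} \frac{y_{i...}^2}{n_{i..}} - \frac{y_{....}^2}{N}, \\
SS_B &= \sum_{j=1}^{b} n_{.j.} (\bar{y}_{.j..} - \bar{y}_{....})^2
      = \sum_{j=1}^{b} \frac{y_{.j..}^2}{n_{.j.}} - \frac{y_{....}^2}{N}, \\
SS_C &= \sum_{k=1}^{c} n_{..k} (\bar{y}_{..k.} - \bar{y}_{....})^2
      = \sum_{k=1}^{c} \frac{y_{..k.}^2}{n_{..k}} - \frac{y_{....}^2}{N}, \\
SS_{AB} &= \sum_{i=1}^{a}\sum_{j=1}^{b} n_{ij.} (\bar{y}_{ij..} - \bar{y}_{i...} - \bar{y}_{.j..} + \bar{y}_{....})^2 \\
      &= \sum_{i=1}^{a}\sum_{j=1}^{b} \frac{y_{ij..}^2}{n_{ij.}} - \sum_{i=1}^{a} \frac{y_{i...}^2}{n_{i..}}
         - \sum_{j=1}^{b} \frac{y_{.j..}^2}{n_{.j.}} + \frac{y_{....}^2}{N}, \\
SS_{AC} &= \sum_{i=1}^{a}\sum_{k=1}^{c} n_{i.k} (\bar{y}_{i.k.} - \bar{y}_{i...} - \bar{y}_{..k.} + \bar{y}_{....})^2 \\
      &= \sum_{i=1}^{a}\sum_{k=1}^{c} \frac{y_{i.k.}^2}{n_{i.k}} - \sum_{i=1}^{a} \frac{y_{i...}^2}{n_{i..}}
         - \sum_{k=1}^{c} \frac{y_{..k.}^2}{n_{..k}} + \frac{y_{....}^2}{N}, \\
SS_{BC} &= \sum_{j=1}^{b}\sum_{k=1}^{c} n_{.jk} (\bar{y}_{.jk.} - \bar{y}_{.j..} - \bar{y}_{..k.} + \bar{y}_{....})^2 \\
      &= \sum_{j=1}^{b}\sum_{k=1}^{c} \frac{y_{.jk.}^2}{n_{.jk}} - \sum_{j=1}^{b} \frac{y_{.j..}^2}{n_{.j.}}
         - \sum_{k=1}^{c} \frac{y_{..k.}^2}{n_{..k}} + \frac{y_{....}^2}{N}, \\
SS_{ABC} &= \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c} n_{ijk}
         (\bar{y}_{ijk.} - \bar{y}_{ij..} - \bar{y}_{i.k.} - \bar{y}_{.jk.} + \bar{y}_{i...} + \bar{y}_{.j..} + \bar{y}_{..k.} - \bar{y}_{....})^2,
\end{aligned}
\]

and

\[
SS_E = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\sum_{\ell=1}^{n_{ijk}} (y_{ijk\ell} - \bar{y}_{ijk.})^2
     = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\sum_{\ell=1}^{n_{ijk}} y_{ijk\ell}^2
       - \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c} \frac{y_{ijk.}^2}{n_{ijk}},
\]

where

\[
\begin{aligned}
y_{ijk.} &= \sum_{\ell=1}^{n_{ijk}} y_{ijk\ell}, & \bar{y}_{ijk.} &= y_{ijk.}/n_{ijk}, \\
y_{ij..} &= \sum_{k=1}^{c} y_{ijk.}, & \bar{y}_{ij..} &= y_{ij..}/n_{ij.}, \\
y_{i.k.} &= \sum_{j=1}^{b} y_{ijk.}, & \bar{y}_{i.k.} &= y_{i.k.}/n_{i.k}, \\
y_{.jk.} &= \sum_{i=1}^{a} y_{ijk.}, & \bar{y}_{.jk.} &= y_{.jk.}/n_{.jk}, \\
y_{i...} &= \sum_{j=1}^{b} y_{ij..}, & \bar{y}_{i...} &= y_{i...}/n_{i..}, \\
y_{.j..} &= \sum_{i=1}^{a} y_{ij..}, & \bar{y}_{.j..} &= y_{.j..}/n_{.j.}, \\
y_{..k.} &= \sum_{i=1}^{a} y_{i.k.}, & \bar{y}_{..k.} &= y_{..k.}/n_{..k}, \\
y_{....} &= \sum_{i=1}^{a} y_{i...} = \sum_{j=1}^{b} y_{.j..} = \sum_{k=1}^{c} y_{..k.}, & \bar{y}_{....} &= y_{....}/N,
\end{aligned}
\]

\[
n_{ij.} = \sum_{k=1}^{c} n_{ijk}, \qquad n_{i.k} = \sum_{j=1}^{b} n_{ijk}, \qquad n_{.jk} = \sum_{i=1}^{a} n_{ijk},
\]

\[
n_{i..} = \sum_{j=1}^{b} n_{ij.}, \qquad n_{.j.} = \sum_{i=1}^{a} n_{ij.}, \qquad n_{..k} = \sum_{i=1}^{a} n_{i.k},
\]

and

\[
N = \sum_{i=1}^{a} n_{i..} = \sum_{j=1}^{b} n_{.j.} = \sum_{k=1}^{c} n_{..k}.
\]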
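
Before applying these formulae it may be worth confirming that a given layout of cell sizes really is proportional. The sketch below is an illustration, not part of the original text; it assumes the proportional pattern takes the form $n_{ijk} = n_{i..} n_{.j.} n_{..k} / N^2$, and the function name and the example cell sizes are hypothetical.

```python
from fractions import Fraction


def is_proportional(n):
    """Check n[i][j][k] == n_i.. * n_.j. * n_..k / N**2 for every cell (exact arithmetic)."""
    a, b, c = len(n), len(n[0]), len(n[0][0])
    n_i = [sum(n[i][j][k] for j in range(b) for k in range(c)) for i in range(a)]
    n_j = [sum(n[i][j][k] for i in range(a) for k in range(c)) for j in range(b)]
    n_k = [sum(n[i][j][k] for i in range(a) for j in range(b)) for k in range(c)]
    N = sum(n_i)
    return all(
        Fraction(n_i[i] * n_j[j] * n_k[k], N * N) == n[i][j][k]
        for i in range(a) for j in range(b) for k in range(c)
    )


# Hypothetical cell sizes built as products u_i * v_j * w_k, hence proportional.
u, v, w = [1, 2], [1, 3], [2, 2]
sizes = [[[ui * vj * wk for wk in w] for vj in v] for ui in u]
proportional = is_proportional(sizes)

# Changing a single cell breaks the pattern.
broken = [[row[:] for row in plane] for plane in sizes]
broken[0][0][0] += 1
not_proportional = is_proportional(broken)
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the equality test is not affected by floating-point rounding.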


The analysis of variance with expected mean squares for the fixed effects model
is given in Table 5.9. Under the assumption of normality, the test procedures
are performed as in the case of corresponding balanced analysis. The expected
mean squares for the random and mixed effects in the general case of
disproportional frequencies are extremely involved and the interested reader is referred
to Blischke (1966) for further information and details.

TABLE 5.9
Analysis of Variance for the Unbalanced Fixed Effects Model in
(5.1.1) with Proportional Frequencies

Source of             Degrees of           Sum of     Mean       Expected Mean
Variation             Freedom              Squares    Square     Square

Due to A              a - 1                SS_A       MS_A       $\sigma_e^2 + \frac{1}{a-1} \sum_{i=1}^{a} n_{i..} \alpha_i^2$
Due to B              b - 1                SS_B       MS_B       $\sigma_e^2 + \frac{1}{b-1} \sum_{j=1}^{b} n_{.j.} \beta_j^2$
Due to C              c - 1                SS_C       MS_C       $\sigma_e^2 + \frac{1}{c-1} \sum_{k=1}^{c} n_{..k} \gamma_k^2$
Interaction A x B     (a - 1)(b - 1)       SS_AB      MS_AB      $\sigma_e^2 + \frac{1}{(a-1)(b-1)} \sum_{i=1}^{a}\sum_{j=1}^{b} n_{ij.} (\alpha\beta)_{ij}^2$
Interaction A x C     (a - 1)(c - 1)       SS_AC      MS_AC      $\sigma_e^2 + \frac{1}{(a-1)(c-1)} \sum_{i=1}^{a}\sum_{k=1}^{c} n_{i.k} (\alpha\gamma)_{ik}^2$
Interaction B x C     (b - 1)(c - 1)       SS_BC      MS_BC      $\sigma_e^2 + \frac{1}{(b-1)(c-1)} \sum_{j=1}^{b}\sum_{k=1}^{c} n_{.jk} (\beta\gamma)_{jk}^2$
Interaction A x B x C (a - 1)(b - 1)(c - 1) SS_ABC    MS_ABC     $\sigma_e^2 + \frac{1}{(a-1)(b-1)(c-1)} \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c} n_{ijk} (\alpha\beta\gamma)_{ijk}^2$
Error                 N - abc              SS_E       MS_E       $\sigma_e^2$

The model for the r-way crossed classification remains the same as in
(5.12.1), except that the sample size corresponding to the $(i_1, i_2, \ldots, i_r)$-th cell
will now be denoted by $n_{i_1 i_2 \ldots i_r}$. The analysis when the unequal sample sizes
follow a proportional pattern follows readily on the lines of the three-way crossed
classification outlined above. For further information and details, the reader is
referred to Blischke (1968).


5.14 WORKED EXAMPLE FOR MODEL I 


Anderson and Bancroft (1952, p. 291) reported data from an experiment
designed to study the effect of electrolytic chromium plate as a source for the
chromium impregnation of low-carbon steel wire. The experiment involved 18
treatments obtained as a combination of three diffusion temperatures (2200°F,
2350°F, 2500°F), three diffusion times (4, 8, and 12 hours), and two degreasing
treatments (yes and no). Each treatment was applied on four wires giving a total
of 72 determinations on average resistivities (in m-ohms/cm$^3$), which was the
variable being studied. The data are given in Table 5.10.

The data in Table 5.10 can be regarded as a three-way classification with
four observations per cell. Note that all three factors, temperature, time, and
degreasing, should be regarded as fixed effects since the interest is directed only
to the levels of the factors included in the experiment. The mathematical model
for the experiment would be

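\[
\begin{aligned}
y_{ijk\ell} = {} & \mu + \alpha_i + \beta_j + \gamma_k + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} \\
  & + (\alpha\beta\gamma)_{ijk} + e_{ijk\ell},
\end{aligned}
\qquad
\begin{cases}
i = 1, 2, 3 \\
j = 1, 2, 3 \\
k = 1, 2 \\
\ell = 1, 2, 3, 4,
\end{cases}
\]

where $\mu$ is the general mean, $\alpha_i$ is the effect of the i-th level of temperature, $\beta_j$
is the effect of the j-th level of time, $\gamma_k$ is the effect of the k-th level of degreasing,
$(\alpha\beta)_{ij}$ is the interaction between the i-th temperature and the j-th time, $(\alpha\gamma)_{ik}$
is the interaction between the i-th temperature and the k-th degreasing, $(\beta\gamma)_{jk}$ is
the interaction between the j-th time and the k-th degreasing, $(\alpha\beta\gamma)_{ijk}$ is the
interaction between the i-th temperature, the j-th time, and the k-th degreasing,
and $e_{ijk\ell}$ is the customary error term. Furthermore, it is assumed that the $\alpha_i$'s,
$\beta_j$'s, $\gamma_k$'s, $(\alpha\beta)_{ij}$'s, $(\alpha\gamma)_{ik}$'s, $(\beta\gamma)_{jk}$'s, and $(\alpha\beta\gamma)_{ijk}$'s are constants subject to
the restrictions:

\[
\sum_{i=1}^{3} \alpha_i = \sum_{j=1}^{3} \beta_j = \sum_{k=1}^{2} \gamma_k = 0,
\]

\[
\sum_{i=1}^{3} (\alpha\beta)_{ij} = \sum_{j=1}^{3} (\alpha\beta)_{ij}
  = \sum_{i=1}^{3} (\alpha\gamma)_{ik} = \sum_{k=1}^{2} (\alpha\gamma)_{ik}
  = \sum_{j=1}^{3} (\beta\gamma)_{jk} = \sum_{k=1}^{2} (\beta\gamma)_{jk} = 0,
\]

\[
\sum_{i=1}^{3} (\alpha\beta\gamma)_{ijk} = \sum_{j=1}^{3} (\alpha\beta\gamma)_{ijk} = \sum_{k=1}^{2} (\alpha\beta\gamma)_{ijk} = 0,
\]

and the $e_{ijk\ell}$'s are independently and normally distributed with mean zero and
variance $\sigma_e^2$.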
The first step in the analysis of variance computations is to form a three-way
table of cell totals containing $y_{ijk.} = \sum_{\ell=1}^{4} y_{ijk\ell}$ (see Table 5.11). The next step
consists of forming sums over every index and every combination of indices.
Thus, we sum over the levels of degreasing to get a temperature (i) x time (j)
table containing $y_{ij..} = \sum_{k=1}^{2} y_{ijk.}$ (see Table 5.12). This table is then summed
over the levels of time to obtain $y_{i...}$ and over the levels of temperature to give
$y_{.j..}$. The sum of the $y_{i...}$ is equal to the sum of the $y_{.j..}$, which is the grand total $y_{....}$.
Tables 5.13 and 5.14 are obtained similarly.

TABLE 5.10
Data on Average Resistivities (in m-ohms/cm$^3$) for Electrolytic Chromium Plate Example

[Table body: 72 observations classified by temperature (2200°F, 2350°F, 2500°F), time (4, 8, 12 hrs), and degreasing (Yes, No); the individual values are not recoverable here. The cell totals are given in Table 5.11.]

Source: Anderson and Bancroft (1952, p. 291). Used with permission.

TABLE 5.11
Cell Totals $y_{ijk.}$

                                          Temperature
                      2200°F                  2350°F                  2500°F
Time             4 hrs  8 hrs  12 hrs    4 hrs  8 hrs  12 hrs    4 hrs  8 hrs  12 hrs

Degreasing Yes    73.6   77.8   80.8      84.0   90.4   93.2      91.1  107.0  105.9
           No     74.7   79.6   80.8      86.2   90.2   94.4      92.9  104.8  106.0

TABLE 5.12
Sums over Levels of Degreasing $y_{ij..}$

                              Time
Temperature (°F)     4 hrs    8 hrs    12 hrs    $y_{i...}$
2200                 148.3    157.4    161.6       467.3
2350                 170.2    180.6    187.6       538.4
2500                 184.0    211.8    211.9       607.7
$y_{.j..}$           502.5    549.8    561.1     $y_{....}$ = 1,613.4

TABLE 5.13
Sums over Levels of Time $y_{i.k.}$

                         Degreasing
Temperature (°F)      Yes       No      $y_{i...}$
2200                 232.2    235.1       467.3
2350                 267.6    270.8       538.4
2500                 304.0    303.7       607.7
$y_{..k.}$           803.8    809.6     $y_{....}$ = 1,613.4


TABLE 5.14
Sums over Levels of Temperature $y_{.jk.}$

                    Degreasing
Time (hrs)       Yes       No      $y_{.j..}$
4               248.7    253.8       502.5
8               275.2    274.6       549.8
12              279.9    281.2       561.1
$y_{..k.}$      803.8    809.6     $y_{....}$ = 1,613.4


With these preliminary results, the subsequent analysis of variance
calculations are fairly straightforward. Thus, from the computational formulae of
Section 5.7, we have

\[
\begin{aligned}
SS_T &= (17.9)^2 + (18.0)^2 + \cdots + (26.9)^2 - \frac{(1,613.4)^2}{3 \times 3 \times 2 \times 4} \\
     &= 36,677.86 - 36,153.6050 = 524.2550, \\
SS_A &= \frac{1}{3 \times 2 \times 4}\{(467.3)^2 + (538.4)^2 + (607.7)^2\} - \frac{(1,613.4)^2}{3 \times 3 \times 2 \times 4} \\
     &= 36,564.2975 - 36,153.6050 = 410.6925, \\
SS_B &= \frac{1}{3 \times 2 \times 4}\{(502.5)^2 + (549.8)^2 + (561.1)^2\} - \frac{(1,613.4)^2}{3 \times 3 \times 2 \times 4} \\
     &= 36,234.1458 - 36,153.6050 = 80.5408, \\
SS_C &= \frac{1}{3 \times 3 \times 4}\{(803.8)^2 + (809.6)^2\} - \frac{(1,613.4)^2}{3 \times 3 \times 2 \times 4} \\
     &= 36,154.0722 - 36,153.6050 = 0.4672, \\
SS_{AB} &= \frac{1}{2 \times 4}\{(148.3)^2 + (157.4)^2 + \cdots + (211.9)^2\} - \frac{(1,613.4)^2}{3 \times 3 \times 2 \times 4} - SS_A - SS_B \\
     &= 36,659.6525 - 36,153.6050 - 410.6925 - 80.5408 = 14.8142, \\
SS_{AC} &= \frac{1}{3 \times 4}\{(232.2)^2 + (235.1)^2 + \cdots + (303.7)^2\} - \frac{(1,613.4)^2}{3 \times 3 \times 2 \times 4} - SS_A - SS_C \\
     &= 36,565.0783 - 36,153.6050 - 410.6925 - 0.4672 = 0.3136, \\
SS_{BC} &= \frac{1}{3 \times 4}\{(248.7)^2 + (253.8)^2 + \cdots + (281.2)^2\} - \frac{(1,613.4)^2}{3 \times 3 \times 2 \times 4} - SS_B - SS_C \\
     &= 36,235.3150 - 36,153.6050 - 80.5408 - 0.4672 = 0.7020,
\end{aligned}
\]

and

\[
\begin{aligned}
SS_{ABC} &= \frac{1}{4}\{(73.6)^2 + (74.7)^2 + \cdots + (106.0)^2\} - \frac{(1,613.4)^2}{3 \times 3 \times 2 \times 4} \\
     &\quad - SS_{AB} - SS_{AC} - SS_{BC} - SS_A - SS_B - SS_C \\
     &= 36,662.01 - 36,153.6050 - 14.8142 - 0.3136 - 0.7020 - 410.6925 - 80.5408 - 0.4672 \\
     &= 0.8747.
\end{aligned}
\]

Finally, by subtraction, the error sum of squares is given by

\[
\begin{aligned}
SS_E &= 524.2550 - 410.6925 - 80.5408 - 0.4672 - 14.8142 - 0.3136 - 0.7020 - 0.8747 \\
     &= 15.8500.
\end{aligned}
\]

These results along with the remaining calculations are entered in Table 5.15.

TABLE 5.15
Analysis of Variance for the Resistivity Data of Table 5.10

Source of                          Degrees of   Sum of      Mean       Expected Mean                                                                                             F Value   p-Value
Variation                          Freedom      Squares     Square     Square

Temperature                            2        410.6925    205.3462   $\sigma_e^2 + \frac{3 \times 2 \times 4}{3-1} \sum_{i=1}^{3} \alpha_i^2$                                   699.60    <0.001
Time                                   2         80.5408     40.2704   $\sigma_e^2 + \frac{3 \times 2 \times 4}{3-1} \sum_{j=1}^{3} \beta_j^2$                                    137.20    <0.001
Degreasing                             1          0.4672      0.4672   $\sigma_e^2 + \frac{3 \times 3 \times 4}{2-1} \sum_{k=1}^{2} \gamma_k^2$                                     1.59     0.213
Temperature x Time                     4         14.8142      3.7035   $\sigma_e^2 + \frac{2 \times 4}{(3-1)(3-1)} \sum_{i=1}^{3}\sum_{j=1}^{3} (\alpha\beta)_{ij}^2$              12.62    <0.001
Temperature x Degreasing               2          0.3136      0.1568   $\sigma_e^2 + \frac{3 \times 4}{(3-1)(2-1)} \sum_{i=1}^{3}\sum_{k=1}^{2} (\alpha\gamma)_{ik}^2$              0.53     0.589
Time x Degreasing                      2          0.7020      0.3510   $\sigma_e^2 + \frac{3 \times 4}{(3-1)(2-1)} \sum_{j=1}^{3}\sum_{k=1}^{2} (\beta\gamma)_{jk}^2$               1.20     0.310
Temperature x Time x Degreasing        4          0.8747      0.2187   $\sigma_e^2 + \frac{4}{(3-1)(3-1)(2-1)} \sum_{i=1}^{3}\sum_{j=1}^{3}\sum_{k=1}^{2} (\alpha\beta\gamma)_{ijk}^2$   0.75     0.566
Error                                 54         15.8500      0.2935   $\sigma_e^2$
Total                                 71        524.2550

From Table 5.15 it is evident that the variance ratio for the three-factor
interaction (i.e., temperature x time x degreasing) is less than one, indicating a
nonsignificant effect (p = 0.566). The variance ratios for temperature x degreasing
as well as time x degreasing interactions are also too low to achieve any
significance. However, the variance ratio for temperature x time interactions is
quite large and highly significant (p < 0.001). In terms of the main effects,
the variance ratio for the degreasing effect is relatively small and
nonsignificant (p = 0.213). However, the variance ratios for temperature and time main
effects are extremely large and highly significant (p < 0.001). Hence, we may
reach the following conclusions:


(i) There are no three-factor (i.e., temperature x time x degreasing)
interactions.

(ii) There are no two-factor interactions between degreasing and either of the
other two factors (temperature and time). However, there are significant
interactions between temperature and time.

(iii) There are two main effects, that is, due to temperature and time. There
are no main effects for degreasing.


In view of the presence of significant temperature x time interactions, the tests 
for temperature and time main effects are not particularly meaningful. The 
researcher should first investigate the nature of temperature x time interactions 
before determining whether main effects are of any practical interest. 

To study the nature of the temperature x time interactions, suppose the
researcher wished to estimate separately the differences in average
resistivities for three diffusion temperatures for three diffusion times. The contrasts of
interest are:

\[
\begin{aligned}
L_1 &= \mu_{21.} - \mu_{11.}, & L_2 &= \mu_{31.} - \mu_{21.}, & L_3 &= \mu_{31.} - \mu_{11.}, \\
L_4 &= \mu_{22.} - \mu_{12.}, & L_5 &= \mu_{32.} - \mu_{22.}, & L_6 &= \mu_{32.} - \mu_{12.}, \\
L_7 &= \mu_{23.} - \mu_{13.}, & L_8 &= \mu_{33.} - \mu_{23.}, & L_9 &= \mu_{33.} - \mu_{13.}.
\end{aligned}
\]

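The preceding contrasts are estimated as

\[
\begin{aligned}
\hat{L}_1 &= \frac{170.2}{8} - \frac{148.3}{8} = 2.74, & \hat{L}_2 &= \frac{184.0}{8} - \frac{170.2}{8} = 1.73, \\
\hat{L}_3 &= \frac{184.0}{8} - \frac{148.3}{8} = 4.46, & \hat{L}_4 &= \frac{180.6}{8} - \frac{157.4}{8} = 2.90, \\
\hat{L}_5 &= \frac{211.8}{8} - \frac{180.6}{8} = 3.90, & \hat{L}_6 &= \frac{211.8}{8} - \frac{157.4}{8} = 6.80, \\
\hat{L}_7 &= \frac{187.6}{8} - \frac{161.6}{8} = 3.25, & \hat{L}_8 &= \frac{211.9}{8} - \frac{187.6}{8} = 3.04, \\
\hat{L}_9 &= \frac{211.9}{8} - \frac{161.6}{8} = 6.29.
\end{aligned}
\]

The estimated variances are obtained as

\[
\widehat{\mathrm{Var}}(\hat{L}_1) = \widehat{\mathrm{Var}}(\hat{L}_2) = \cdots = \widehat{\mathrm{Var}}(\hat{L}_9)
  = \frac{MS_E}{4 \times 2}\left[(1)^2 + (-1)^2\right] = \frac{0.2935}{4 \times 2}(2) = 0.073.
\]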

The desired 95 percent Bonferroni intervals for the contrasts of interest are
determined as

\[
\hat{L}_i \pm t[54, 1 - 0.05/18] \sqrt{\widehat{\mathrm{Var}}(\hat{L}_i)} = \hat{L}_i \pm 2.89 \times 0.270, \qquad i = 1, \ldots, 9;
\]

that is,

\[
\begin{aligned}
1.96 &= 2.74 - 2.89 \times 0.270 < \mu_{21.} - \mu_{11.} < 2.74 + 2.89 \times 0.270 = 3.52, \\
0.95 &= 1.73 - 2.89 \times 0.270 < \mu_{31.} - \mu_{21.} < 1.73 + 2.89 \times 0.270 = 2.51, \\
3.68 &= 4.46 - 2.89 \times 0.270 < \mu_{31.} - \mu_{11.} < 4.46 + 2.89 \times 0.270 = 5.24, \\
2.12 &= 2.90 - 2.89 \times 0.270 < \mu_{22.} - \mu_{12.} < 2.90 + 2.89 \times 0.270 = 3.68, \\
3.12 &= 3.90 - 2.89 \times 0.270 < \mu_{32.} - \mu_{22.} < 3.90 + 2.89 \times 0.270 = 4.68, \\
6.02 &= 6.80 - 2.89 \times 0.270 < \mu_{32.} - \mu_{12.} < 6.80 + 2.89 \times 0.270 = 7.58, \\
2.47 &= 3.25 - 2.89 \times 0.270 < \mu_{23.} - \mu_{13.} < 3.25 + 2.89 \times 0.270 = 4.03, \\
2.26 &= 3.04 - 2.89 \times 0.270 < \mu_{33.} - \mu_{23.} < 3.04 + 2.89 \times 0.270 = 3.82, \\
5.51 &= 6.29 - 2.89 \times 0.270 < \mu_{33.} - \mu_{13.} < 6.29 + 2.89 \times 0.270 = 7.07.
\end{aligned}
\]
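
The nine intervals can be regenerated from the temperature x time totals of Table 5.12. This is a sketch under stated assumptions, not the authors' code: the critical value 2.89 and the error mean square 0.2935 are taken from the text rather than computed, and the variable names are illustrative. Because the standard error is not rounded to 0.270 here, the limits may differ from the printed ones in the last digit.

```python
from math import sqrt

ms_e = 0.2935     # error mean square from Table 5.15
n, c = 4, 2       # replicates per cell, levels of degreasing
t_crit = 2.89     # t[54, 1 - 0.05/18] as quoted in the text

# Totals y_ij.. from Table 5.12: rows are 2200, 2350, 2500 F; columns 4, 8, 12 hrs.
totals = [
    [148.3, 157.4, 161.6],
    [170.2, 180.6, 187.6],
    [184.0, 211.8, 211.9],
]
means = [[t / (n * c) for t in row] for row in totals]

# Each contrast is a difference of two cell means based on n*c = 8 observations each.
std_err = sqrt(ms_e / (n * c) * (1 ** 2 + (-1) ** 2))
half_width = t_crit * std_err

pairs = [(1, 0), (2, 1), (2, 0)]   # (higher, lower) temperature indices
intervals = []                     # ordered L1, ..., L9 as in the text
for j in range(3):                 # diffusion time
    for hi, lo in pairs:
        estimate = means[hi][j] - means[lo][j]
        intervals.append((estimate - half_width, estimate, estimate + half_width))
```

All nine lower limits are positive, which is consistent with resistivity increasing with temperature at every diffusion time.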


The average resistivities for different combinations of diffusion temperatures 
and times indicate that average resistivity increases when going from lower to 
higher temperature levels. However, the increases are greater as one moves from 
lower to higher levels of diffusion time. The different influence of diffusion 
temperature, which depends on the diffusion time, implies that the temperature 
and time factors interact in their effect on resistivity. In view of the important 
interaction effects between temperature and time on average resistivities in the 
study findings, the researcher may decide that main effects due to temperature 
and time are not meaningful or of practical importance. 


5.15 WORKED EXAMPLE FOR MODEL II 


Johnson and Leone (1977, p. 861) reported data from an experiment designed to 
study the melting point of a homogeneous sample of hydroquinone. The
experiment was performed with three analysts using three uncalibrated thermometers
and working in three separate weeks. The data are given in Table 5.16. 

The data in Table 5.16 can be regarded as a three-way classification with one 
observation per cell and all three factors can be regarded as random effects. The 
mathematical model for this experiment would be 


\[
y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_k + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + e_{ijk},
\qquad
\begin{cases}
i = 1, 2, 3 \\
j = 1, 2, 3 \\
k = 1, 2, 3,
\end{cases}
\]

where $\mu$ is the general mean, $\alpha_i$ is the effect of the i-th thermometer, $\beta_j$ is
the effect of the j-th week, $\gamma_k$ is the effect of the k-th analyst, $(\alpha\beta)_{ij}$ is the
interaction of the i-th thermometer with the j-th week, $(\alpha\gamma)_{ik}$ is the interaction
of the i-th thermometer with the k-th analyst, $(\beta\gamma)_{jk}$ is the interaction of the
j-th week with the k-th analyst, and $e_{ijk}$ is the customary error term. In order
to estimate the error variance, it is assumed that the thermometer x week x analyst
interaction is zero. Furthermore, it is assumed that the $\alpha_i$'s, $\beta_j$'s, $\gamma_k$'s, $(\alpha\beta)_{ij}$'s,
$(\alpha\gamma)_{ik}$'s, $(\beta\gamma)_{jk}$'s, and $e_{ijk}$'s are independently and normally distributed with
zero means and variances $\sigma_\alpha^2$, $\sigma_\beta^2$, $\sigma_\gamma^2$, $\sigma_{\alpha\beta}^2$, $\sigma_{\alpha\gamma}^2$, $\sigma_{\beta\gamma}^2$, and $\sigma_e^2$, respectively.

TABLE 5.16
Data on the Melting Points of a Homogeneous Sample of Hydroquinone

                                     Thermometer
                     1                    2                    3
Week            1      2      3      1      2      3      1      2      3

Analyst 1     174.0  173.5  174.5  173.0  173.5  173.0  171.5  172.5  173.0
        2     173.0  173.0  173.5  172.0  173.0  173.5  171.0  172.0  171.5
        3     173.5  173.0  173.0  173.0  173.5  172.5  173.0  173.0  172.5

Source: Johnson and Leone (1977, p. 861). Used with permission.

TABLE 5.17
Sums over Analysts $y_{ij.}$

                              Week (j)
Thermometer (i)        1          2          3        $y_{i..}$
1                    520.5      519.5      521.0      1,561.0
2                    518.0      520.0      519.0      1,557.0
3                    515.5      517.5      517.0      1,550.0
$y_{.j.}$          1,554.0    1,557.0    1,557.0    $y_{...}$ = 4,668.0

The first step in the analysis of variance computations consists of forming
sums over every index and every combination of indices. Thus, we sum over
analysts (k) to obtain a thermometer (i) x week (j) table containing
$y_{ij.} = \sum_{k=1}^{3} y_{ijk}$ (see Table 5.17). This table is then summed over the levels of week
(j) to obtain the thermometer totals ($y_{i..}$) and again over thermometers (i) to
obtain week totals ($y_{.j.}$). Now, the sum of the thermometer totals is equal to the
sum of the week totals, which gives the grand total ($y_{...}$). Finally, Tables 5.18
and 5.19 are obtained in a similar way.

TABLE 5.18
Sums over Weeks $y_{i.k}$

                          Thermometer (i)
Analyst (k)          1          2          3        $y_{..k}$
1                  522.0      519.5      517.0      1,558.5
2                  519.5      518.5      514.5      1,552.5
3                  519.5      519.0      518.5      1,557.0
$y_{i..}$        1,561.0    1,557.0    1,550.0    $y_{...}$ = 4,668.0

TABLE 5.19
Sums over Thermometers $y_{.jk}$

                              Week (j)
Analyst (k)          1          2          3        $y_{..k}$
1                  518.5      519.5      520.5      1,558.5
2                  516.0      518.0      518.5      1,552.5
3                  519.5      519.5      518.0      1,557.0
$y_{.j.}$        1,554.0    1,557.0    1,557.0    $y_{...}$ = 4,668.0

With these preliminary results, the subsequent analysis of variance
calculations are rather straightforward. Thus, from the computational formulae of
Section 5.10, we have

\[
\begin{aligned}
SS_T &= (174.0)^2 + (173.0)^2 + \cdots + (172.5)^2 - \frac{(4,668.0)^2}{3 \times 3 \times 3} \\
     &= 807,061 - 807,045.333 = 15.667, \\
SS_A &= \frac{(1,561.0)^2 + (1,557.0)^2 + (1,550.0)^2}{3 \times 3} - \frac{(4,668.0)^2}{3 \times 3 \times 3} \\
     &= 807,052.222 - 807,045.333 = 6.889, \\
SS_B &= \frac{(1,554.0)^2 + (1,557.0)^2 + (1,557.0)^2}{3 \times 3} - \frac{(4,668.0)^2}{3 \times 3 \times 3} \\
     &= 807,046.000 - 807,045.333 = 0.667, \\
SS_C &= \frac{(1,558.5)^2 + (1,552.5)^2 + (1,557.0)^2}{3 \times 3} - \frac{(4,668.0)^2}{3 \times 3 \times 3} \\
     &= 807,047.500 - 807,045.333 = 2.167, \\
SS_{AB} &= \frac{(520.5)^2 + (519.5)^2 + \cdots + (517.0)^2}{3} - \frac{(4,668.0)^2}{3 \times 3 \times 3} - SS_A - SS_B \\
     &= 807,054.000 - 807,045.333 - 6.889 - 0.667 = 1.111, \\
SS_{AC} &= \frac{(522.0)^2 + (519.5)^2 + \cdots + (518.5)^2}{3} - \frac{(4,668.0)^2}{3 \times 3 \times 3} - SS_A - SS_C \\
     &= 807,056.500 - 807,045.333 - 6.889 - 2.167 = 2.111,
\end{aligned}
\]

and

\[
\begin{aligned}
SS_{BC} &= \frac{(518.5)^2 + (519.5)^2 + \cdots + (518.0)^2}{3} - \frac{(4,668.0)^2}{3 \times 3 \times 3} - SS_B - SS_C \\
     &= 807,049.833 - 807,045.333 - 0.667 - 2.167 = 1.667.
\end{aligned}
\]

Finally, by subtraction, the error (three-way interaction) sum of squares is given
by

\[
\begin{aligned}
SS_E &= 15.667 - 6.889 - 0.667 - 2.167 - 1.111 - 2.111 - 1.667 \\
     &= 1.055.
\end{aligned}
\]
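
Since Table 5.16 contains only 27 observations, the whole decomposition can be recomputed from the raw data. The sketch below is illustrative rather than the authors' procedure (all names are assumptions); it uses unrounded intermediate values, so the error sum of squares comes out as 1.0556 rather than the 1.055 obtained above from rounded terms.

```python
# y[i][j][k]: thermometer i, week j, analyst k (data of Table 5.16).
y = [
    [[174.0, 173.0, 173.5], [173.5, 173.0, 173.0], [174.5, 173.5, 173.0]],
    [[173.0, 172.0, 173.0], [173.5, 173.0, 173.5], [173.0, 173.5, 172.5]],
    [[171.5, 171.0, 173.0], [172.5, 172.0, 173.0], [173.0, 171.5, 172.5]],
]
a = b = c = 3


def scaled_sum_sq(values, size):
    """Sum of squared totals divided by the number of observations per total."""
    return sum(v * v for v in values) / size


def flat(table):
    return [v for row in table for v in row]


y_ij = [[sum(y[i][j]) for j in range(b)] for i in range(a)]
y_ik = [[sum(y[i][j][k] for j in range(b)) for k in range(c)] for i in range(a)]
y_jk = [[sum(y[i][j][k] for i in range(a)) for k in range(c)] for j in range(b)]
y_i = [sum(row) for row in y_ij]
y_j = [sum(y_ij[i][j] for i in range(a)) for j in range(b)]
y_k = [sum(y_ik[i][k] for i in range(a)) for k in range(c)]
grand = sum(y_i)
cf = grand ** 2 / (a * b * c)   # correction factor

ss = {}
ss["A"] = scaled_sum_sq(y_i, b * c) - cf
ss["B"] = scaled_sum_sq(y_j, a * c) - cf
ss["C"] = scaled_sum_sq(y_k, a * b) - cf
ss["AB"] = scaled_sum_sq(flat(y_ij), c) - cf - ss["A"] - ss["B"]
ss["AC"] = scaled_sum_sq(flat(y_ik), b) - cf - ss["A"] - ss["C"]
ss["BC"] = scaled_sum_sq(flat(y_jk), a) - cf - ss["B"] - ss["C"]
observations = [v for plane in y for row in plane for v in row]
ss_total = scaled_sum_sq(observations, 1) - cf
ss["E"] = ss_total - sum(ss.values())   # three-way interaction used as error

df = {"A": 2, "B": 2, "C": 2, "AB": 4, "AC": 4, "BC": 4, "E": 8}
ms = {k: ss[k] / df[k] for k in ss}
f_interactions = {k: ms[k] / ms["E"] for k in ("AB", "AC", "BC")}
```

The interaction ratios in `f_interactions` are the exact F tests against the error term; the main effects need the approximate tests discussed next.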


These results along with the remaining calculations are shown in Table 5.20.

TABLE 5.20
Analysis of Variance for the Melting Point Data of Table 5.16

Source of                   Degrees of   Sum of     Mean      Expected Mean                                                                                    F Value   p-Value
Variation                   Freedom      Squares    Square    Square

Thermometer                     2         6.889     3.444     $\sigma_e^2 + 3\sigma_{\alpha\beta}^2 + 3\sigma_{\alpha\gamma}^2 + 3 \times 3\,\sigma_\alpha^2$     5.11     0.062
Week                            2         0.667     0.333     $\sigma_e^2 + 3\sigma_{\alpha\beta}^2 + 3\sigma_{\beta\gamma}^2 + 3 \times 3\,\sigma_\beta^2$       0.59     0.589
Analyst                         2         2.167     1.083     $\sigma_e^2 + 3\sigma_{\alpha\gamma}^2 + 3\sigma_{\beta\gamma}^2 + 3 \times 3\,\sigma_\gamma^2$     1.33     0.333
Thermometer x Week              4         1.111     0.278     $\sigma_e^2 + 3\sigma_{\alpha\beta}^2$                                                               2.11     0.171
Thermometer x Analyst           4         2.111     0.528     $\sigma_e^2 + 3\sigma_{\alpha\gamma}^2$                                                              4.00     0.045
Week x Analyst                  4         1.667     0.417     $\sigma_e^2 + 3\sigma_{\beta\gamma}^2$                                                               3.16     0.078
Error                           8         1.055     0.132     $\sigma_e^2$
(Thermometer x Week x Analyst)
Total                          26        15.667

From Table 5.20 it is clear that the hypotheses on the two-factor interactions,
namely,

\[
H_0^{AB}: \sigma_{\alpha\beta}^2 = 0 \quad \text{versus} \quad H_1^{AB}: \sigma_{\alpha\beta}^2 > 0,
\]

\[
H_0^{AC}: \sigma_{\alpha\gamma}^2 = 0 \quad \text{versus} \quad H_1^{AC}: \sigma_{\alpha\gamma}^2 > 0,
\]

and

\[
H_0^{BC}: \sigma_{\beta\gamma}^2 = 0 \quad \text{versus} \quad H_1^{BC}: \sigma_{\beta\gamma}^2 > 0,
\]

are all tested against the error (thermometer x week x analyst interaction) term.
On examining the last two columns of Table 5.20, it is seen that only the
thermometer x analyst interaction is significant (p = 0.045).

Now, we note that there are no exact F tests for testing the main effects
hypotheses, namely,

\[
H_0^{A}: \sigma_\alpha^2 = 0 \quad \text{versus} \quad H_1^{A}: \sigma_\alpha^2 > 0,
\]

\[
H_0^{B}: \sigma_\beta^2 = 0 \quad \text{versus} \quad H_1^{B}: \sigma_\beta^2 > 0,
\]

and

\[
H_0^{C}: \sigma_\gamma^2 = 0 \quad \text{versus} \quad H_1^{C}: \sigma_\gamma^2 > 0,
\]

unless we are willing to assume that certain two-factor interactions are zero.
Since we have just concluded that some two-factor interactions are insignificant,
we can obtain exact tests for certain hypotheses by assuming the corresponding
two-factor interactions to be zero. But, if we are unwilling to assume that
any two-factor interactions are zero, then as we know there are no exact F tests
available for this problem. We can, however, obtain pseudo-F tests as discussed
in Section 5.5. In the following we illustrate the pseudo-F test for testing the
hypothesis $H_0^{A}: \sigma_\alpha^2 = 0$ versus $H_1^{A}: \sigma_\alpha^2 > 0$.

From Table 5.20, we obtain

\[
E(MS_{AB}) + E(MS_{AC}) - E(MS_E) = \sigma_e^2 + 3\sigma_{\alpha\beta}^2 + 3\sigma_{\alpha\gamma}^2,
\]

which is precisely equal to $E(MS_A)$ when $\sigma_\alpha^2 = 0$. Hence, the desired test
statistic is

\[
F'_A = \frac{MS_A}{MS_{AB} + MS_{AC} - MS_E} = \frac{3.444}{0.278 + 0.528 - 0.132} = 5.11,
\]

which has an approximate F distribution with 2 and $\nu_a$ degrees of freedom,
where $\nu_a$ (rounded to the nearest digit) is approximated by

\[
\nu_a = \frac{(0.278 + 0.528 - 0.132)^2}{\dfrac{(0.278)^2}{4} + \dfrac{(0.528)^2}{4} + \dfrac{(-0.132)^2}{8}} \approx 5.
\]

For a level of significance of 0.05, we find from Appendix Table V that
F[2, 5; 0.95] = 5.79. Since $F'_A = 5.11 < 5.79$, we do not reject $H_0^A$ and
conclude that thermometers do not have a significant effect on melting point
(p = 0.062). Similarly, approximate F tests can be readily performed for
$H_0^B$ and $H_0^C$, yielding $F'_B = 0.59$, $F'_C = 1.33$, $\nu_b = 5$, and $\nu_c = 6$. The resulting
p-values for $F'_B$ and $F'_C$ are 0.589 and 0.333, respectively, and we may conclude
that weeks and analysts also do not have any significant effect on the melting
point.
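
The three pseudo-F ratios and their Satterthwaite degrees of freedom follow the same pattern and can be computed together. The following is a minimal sketch using the rounded mean squares of Table 5.20; the helper names `satterthwaite` and `pseudo_f` are illustrative, not from the text.

```python
def satterthwaite(terms):
    """terms: list of (coefficient, mean square, df).

    Returns the linear combination of mean squares and its approximate
    degrees of freedom from Satterthwaite's formula.
    """
    combo = sum(coef * ms for coef, ms, _ in terms)
    denom = sum((coef * ms) ** 2 / df for coef, ms, df in terms)
    return combo, combo ** 2 / denom


# Mean squares and degrees of freedom from Table 5.20.
ms = {"A": 3.444, "B": 0.333, "C": 1.083,
      "AB": 0.278, "AC": 0.528, "BC": 0.417, "E": 0.132}
df = {"AB": 4, "AC": 4, "BC": 4, "E": 8}


def pseudo_f(main, inter1, inter2):
    """Pseudo-F for a main effect: MS_main / (MS_int1 + MS_int2 - MS_E)."""
    denom, nu = satterthwaite([
        (1, ms[inter1], df[inter1]),
        (1, ms[inter2], df[inter2]),
        (-1, ms["E"], df["E"]),
    ])
    return ms[main] / denom, nu


f_a, nu_a = pseudo_f("A", "AB", "AC")   # thermometers
f_b, nu_b = pseudo_f("B", "AB", "BC")   # weeks
f_c, nu_c = pseudo_f("C", "AC", "BC")   # analysts
```

Rounding the three approximate degrees of freedom to the nearest integer gives 5, 5, and 6, in agreement with the values quoted above.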

Finally, we can also obtain the unbiased estimates of the variance
components, giving

\[
\begin{aligned}
\hat{\sigma}_e^2 &= 0.132, \\
\hat{\sigma}_{\alpha\beta}^2 &= \frac{0.278 - 0.132}{3} = 0.049, \\
\hat{\sigma}_{\alpha\gamma}^2 &= \frac{0.528 - 0.132}{3} = 0.132, \\
\hat{\sigma}_{\beta\gamma}^2 &= \frac{0.417 - 0.132}{3} = 0.095, \\
\hat{\sigma}_\alpha^2 &= \frac{3.444 - 0.278 - 0.528 + 0.132}{3 \times 3} = 0.308, \\
\hat{\sigma}_\beta^2 &= \frac{0.333 - 0.278 - 0.417 + 0.132}{3 \times 3} = -0.026,
\end{aligned}
\]

and

\[
\hat{\sigma}_\gamma^2 = \frac{1.083 - 0.528 - 0.417 + 0.132}{3 \times 3} = 0.030.
\]

The results on variance components estimates are consistent with those on
tests of hypotheses. It should, however, be noted that although certain variance
components seem to be relatively large, a rather small value for the error degrees
of freedom makes the corresponding F test quite insensitive.


5.16 WORKED EXAMPLE FOR MODEL III


The following example is based on an experiment in dentistry designed to
study the hardness of gold filling material. The data are taken from Halvorsen
(1991, p. 145). Five dentists were asked to prepare the six types of gold
filling material, sintered at the three temperatures, and using each of the three
methods of condensation. The data in Table 5.21 represent a subset of the
original data involving only two alloys of gold filling material and sintered at a
single temperature.

TABLE 5.21
Data on Diamond Pyramid Hardness Number of Dental Fillings Made from Two
Alloys of Gold

                               Alloy
                 Gold Foil               Goldent
Method         1      2      3        1      2      3

Dentist 1     792    772    782      824    772    803
        2     803    752    715      803    772    707
        3     715    792    762      724    715    606
        4     673    657    690      946    743    245
        5     634    649    724      715    724    627

Source: Halvorsen (1991, p. 145). Used with permission.

The data in Table 5.21 can be regarded as a three-way classification with one 
observation per cell. Note that the alloy and method are fixed effects and the 
dentist is a random effect. The mathematical model for this experiment would 
be 


\[
y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_k + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik}
          + (\beta\gamma)_{jk} + e_{ijk},
\]

where $\mu$ is the general mean, $\alpha_i$ is the effect of the i-th alloy, $\beta_j$ is the effect
of the j-th method, $\gamma_k$ is the effect of the k-th dentist, $(\alpha\beta)_{ij}$ is the
interaction of the i-th alloy with the j-th method, $(\alpha\gamma)_{ik}$ is the interaction of
the i-th alloy with the k-th dentist, $(\beta\gamma)_{jk}$ is the interaction of the j-th method
with the k-th dentist, and $e_{ijk}$ is the customary error term. In order to estimate the
error variance, it is assumed that the alloy × method × dentist interaction is zero.
Furthermore, it is assumed that the $\alpha_i$'s, $\beta_j$'s, and $(\alpha\beta)_{ij}$'s are constants subject
to the restrictions:

\[
\sum_{i=1}^{2} \alpha_i = \sum_{j=1}^{3} \beta_j = 0, \qquad
\sum_{i=1}^{2} (\alpha\beta)_{ij} = \sum_{j=1}^{3} (\alpha\beta)_{ij} = 0;
\]

and the $\gamma_k$'s, $(\alpha\gamma)_{ik}$'s, $(\beta\gamma)_{jk}$'s, and $e_{ijk}$'s are independently and normally
distributed with mean zero and variances $\sigma_{\gamma}^2$, $\sigma_{\alpha\gamma}^2$, $\sigma_{\beta\gamma}^2$, and $\sigma_e^2$,
respectively; and subject to the restrictions

\[
\sum_{i=1}^{2} (\alpha\gamma)_{ik} = \sum_{j=1}^{3} (\beta\gamma)_{jk} = 0 \qquad (k = 1, 2, 3, 4, 5).
\]

The first step in the analysis of variance computations consists of forming
sums over every index and every combination of indices. Thus, we sum over
dentists (k) to obtain an alloy (i) × method (j) table containing
$y_{ij.} = \sum_{k=1}^{5} y_{ijk}$ (see Table 5.22). We then sum this table over methods (j) to
obtain the alloy totals ($y_{i..}$) and again over alloys (i) to obtain the method
totals ($y_{.j.}$). The sum of the alloy totals is equal to the sum of the method
totals, which gives the grand total ($y_{...}$). Finally, Tables 5.23 and 5.24 are
obtained in a similar way.


TABLE 5.22
Sums over Dentists $y_{ij.}$

                        Method (j)
Alloy (i)          1        2        3        $y_{i..}$
Gold Foil        3,617    3,622    3,673      10,912
Goldent          4,012    3,726    2,988      10,726
$y_{.j.}$        7,629    7,348    6,661      21,638 ($y_{...}$)

TABLE 5.23
Sums over Methods $y_{i.k}$

                       Alloy (i)
Dentist (k)      Gold Foil    Goldent      $y_{..k}$
    1              2,346       2,399        4,745
    2              2,270       2,282        4,552
    3              2,269       2,045        4,314
    4              2,020       1,934        3,954
    5              2,007       2,066        4,073
$y_{i..}$         10,912      10,726       21,638 ($y_{...}$)


TABLE 5.24
Sums over Alloys $y_{.jk}$

                        Method (j)
Dentist (k)        1        2        3        $y_{..k}$
    1            1,616    1,544    1,585       4,745
    2            1,606    1,524    1,422       4,552
    3            1,439    1,507    1,368       4,314
    4            1,619    1,400      935       3,954
    5            1,349    1,373    1,351       4,073
$y_{.j.}$        7,629    7,348    6,661      21,638 ($y_{...}$)


With these preliminary results, the subsequent analysis of variance
calculations are rather straightforward. Thus, from the computational formulae of
Section 5.10, we have
\[
\begin{aligned}
SS_T &= (792)^2 + (803)^2 + \cdots + (627)^2 - \frac{(21,638)^2}{2 \times 3 \times 5} \\
     &= 15,982,022 - 15,606,768.133 \\
     &= 375,253.867, \\[4pt]
SS_A &= \frac{(10,912)^2 + (10,726)^2}{3 \times 5} - \frac{(21,638)^2}{2 \times 3 \times 5} \\
     &= 15,607,921.333 - 15,606,768.133 \\
     &= 1,153.200, \\[4pt]
SS_B &= \frac{(7,629)^2 + (7,348)^2 + (6,661)^2}{2 \times 5} - \frac{(21,638)^2}{2 \times 3 \times 5} \\
     &= 15,656,366.600 - 15,606,768.133 \\
     &= 49,598.467, \\[4pt]
SS_C &= \frac{(4,745)^2 + (4,552)^2 + \cdots + (4,073)^2}{2 \times 3} - \frac{(21,638)^2}{2 \times 3 \times 5} \\
     &= 15,678,295.000 - 15,606,768.133 \\
     &= 71,526.867, \\[4pt]
SS_{AB} &= \frac{(3,617)^2 + (3,622)^2 + \cdots + (2,988)^2}{5} - \frac{(21,638)^2}{2 \times 3 \times 5} - SS_A - SS_B \\
     &= 15,719,973.200 - 15,606,768.133 - 1,153.200 - 49,598.467 \\
     &= 62,453.400, \\[4pt]
SS_{AC} &= \frac{(2,346)^2 + (2,270)^2 + \cdots + (2,066)^2}{3} - \frac{(21,638)^2}{2 \times 3 \times 5} - SS_A - SS_C \\
     &= 15,688,962.667 - 15,606,768.133 - 1,153.200 - 71,526.867 \\
     &= 9,514.467, \\[4pt]
SS_{BC} &= \frac{(1,616)^2 + (1,606)^2 + \cdots + (1,351)^2}{2} - \frac{(21,638)^2}{2 \times 3 \times 5} - SS_B - SS_C \\
     &= 15,815,112.000 - 15,606,768.133 - 49,598.467 - 71,526.867 \\
     &= 87,218.533.
\end{aligned}
\]


Finally, by subtraction, the error (three-way interaction) sum of squares is given
by

\[
\begin{aligned}
SS_E &= 375,253.867 - 1,153.200 - 49,598.467 - 71,526.867 - 62,453.400 \\
     &\quad - 9,514.467 - 87,218.533 \\
     &= 93,788.933.
\end{aligned}
\]


These results along with the remaining calculations are shown in Table 5.25. 
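
As a cross-check on the hand computations, the marginal totals and sums of squares can be recomputed directly from the data of Table 5.21. The following is a minimal Python sketch, not part of the original analysis; the nested-list layout `y[alloy][method][dentist]` and all variable names are illustrative assumptions.

```python
# Hardness data from Table 5.21, laid out as y[alloy][method][dentist].
y = [
    [  # Gold Foil
        [792, 803, 715, 673, 634],  # method 1, dentists 1..5
        [772, 752, 792, 657, 649],  # method 2
        [782, 715, 762, 690, 724],  # method 3
    ],
    [  # Goldent
        [824, 803, 724, 946, 715],
        [772, 772, 715, 743, 724],
        [803, 707, 606, 245, 627],
    ],
]
a, b, c = 2, 3, 5  # numbers of alloys, methods, dentists

cells = [(i, j, k) for i in range(a) for j in range(b) for k in range(c)]
grand = sum(y[i][j][k] for i, j, k in cells)
cf = grand ** 2 / (a * b * c)  # correction factor y...^2 / (abc)

# One-way and two-way marginal totals (Tables 5.22 through 5.24).
y_i = [sum(y[i][j][k] for j in range(b) for k in range(c)) for i in range(a)]
y_j = [sum(y[i][j][k] for i in range(a) for k in range(c)) for j in range(b)]
y_k = [sum(y[i][j][k] for i in range(a) for j in range(b)) for k in range(c)]
y_ij = [sum(y[i][j]) for i in range(a) for j in range(b)]
y_ik = [sum(y[i][j][k] for j in range(b)) for i in range(a) for k in range(c)]
y_jk = [sum(y[i][j][k] for i in range(a)) for j in range(b) for k in range(c)]


def ss(totals, n_per_total):
    """Sum of squared totals divided by observations per total, minus cf."""
    return sum(t ** 2 for t in totals) / n_per_total - cf


ss_t = sum(y[i][j][k] ** 2 for i, j, k in cells) - cf
ss_a = ss(y_i, b * c)
ss_b = ss(y_j, a * c)
ss_c = ss(y_k, a * b)
ss_ab = ss(y_ij, c) - ss_a - ss_b
ss_ac = ss(y_ik, b) - ss_a - ss_c
ss_bc = ss(y_jk, a) - ss_b - ss_c
ss_e = ss_t - ss_a - ss_b - ss_c - ss_ab - ss_ac - ss_bc  # by subtraction

for name, value in [("SS_T", ss_t), ("SS_A", ss_a), ("SS_B", ss_b),
                    ("SS_C", ss_c), ("SS_AB", ss_ab), ("SS_AC", ss_ac),
                    ("SS_BC", ss_bc), ("SS_E", ss_e)]:
    print(f"{name:6s} {value:14.3f}")
```

The error sum of squares is obtained by subtraction, exactly as in the text, because with one observation per cell the three-way interaction serves as the error term.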

From Table 5.25 it is clear that all the two-factor interactions should be 
tested against the error term. It is immediately evident that the alloy x dentist 
and method x dentist interactions give F ratio values less than one and are 
clearly nonsignificant. Also, the alloy x method has an F value of 2.66, which 
again fails to reach the 5 percent level of significance (p = 0.130). The method 
main effect, when tested against the method x dentist interaction, has an F 
value of 2.27 which is also nonsignificant (p = 0.165). The alloy main effect, 
tested against the alloy x dentist interaction, gives an F ratio of less than 1 and 
is clearly nonsignificant (p = 0.523). The dentist effect, tested against error, 
has an F ratio of 1.53 which is again nonsignificant (p = 0.282). 
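
The choice of denominators described above can be made explicit in a few lines. This is an illustrative Python sketch only; the dictionary names are assumptions, the mean squares are the sums of squares of this section divided by their degrees of freedom, and the p-values quoted in the text are not recomputed here.

```python
# Mean squares: sum of squares divided by degrees of freedom.
ms = {
    "alloy": 1153.200 / 1,
    "method": 49598.467 / 2,
    "dentist": 71526.867 / 4,
    "alloy*method": 62453.400 / 2,
    "alloy*dentist": 9514.467 / 4,
    "method*dentist": 87218.533 / 8,
    "error": 93788.933 / 8,
}

# Denominator mean square for each effect, as stated in the text:
# fixed main effects are tested against their interaction with dentist,
# everything else against the error term.
denominator = {
    "alloy": "alloy*dentist",
    "method": "method*dentist",
    "dentist": "error",
    "alloy*method": "error",
    "alloy*dentist": "error",
    "method*dentist": "error",
}

f_ratio = {effect: ms[effect] / ms[den] for effect, den in denominator.items()}

for effect, f in f_ratio.items():
    print(f"{effect:15s} tested against {denominator[effect]:15s} F = {f:.2f}")
```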

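TABLE 5.25
Analysis of Variance for the Gold Hardness Data of Table 5.21

Source of                  Degrees of    Sum of         Mean
Variation                  Freedom       Squares        Square        F Value   p-Value   Expected Mean Square
Alloy                          1           1,153.200     1,153.200     0.48      0.523    $\sigma_e^2 + 3\sigma_{\alpha\gamma}^2 + \frac{3 \times 5}{2 - 1} \sum_{i=1}^{2} \alpha_i^2$
Method                         2          49,598.467    24,799.233     2.27      0.165    $\sigma_e^2 + 2\sigma_{\beta\gamma}^2 + \frac{2 \times 5}{3 - 1} \sum_{j=1}^{3} \beta_j^2$
Dentist                        4          71,526.867    17,881.717     1.53      0.282    $\sigma_e^2 + 2 \times 3\,\sigma_{\gamma}^2$
Alloy × Method                 2          62,453.400    31,226.700     2.66      0.130    $\sigma_e^2 + \frac{5}{(2 - 1)(3 - 1)} \sum_{i=1}^{2} \sum_{j=1}^{3} (\alpha\beta)_{ij}^2$
Alloy × Dentist                4           9,514.467     2,378.617     0.20      0.930    $\sigma_e^2 + 3\sigma_{\alpha\gamma}^2$
Method × Dentist               8          87,218.533    10,902.317     0.93      0.540    $\sigma_e^2 + 2\sigma_{\beta\gamma}^2$
Error
(Alloy × Method × Dentist)     8          93,788.933    11,723.617                        $\sigma_e^2$
Total                         29         375,253.867


One can also obtain unbiased estimates of the variance components $\sigma_e^2$,
$\sigma_{\beta\gamma}^2$, $\sigma_{\alpha\gamma}^2$, and $\sigma_{\gamma}^2$, yielding

\[
\begin{aligned}
\hat{\sigma}_e^2 &= 11,723.617, \\
\hat{\sigma}_{\beta\gamma}^2 &= \frac{1}{2}(10,902.317 - 11,723.617) = -410.650, \\
\hat{\sigma}_{\alpha\gamma}^2 &= \frac{1}{3}(2,378.617 - 11,723.617) = -3,115.000,
\end{aligned}
\]

and

\[
\hat{\sigma}_{\gamma}^2 = \frac{1}{2 \times 3}(17,881.717 - 11,723.617) = 1,026.350.
\]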
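
These estimates follow from equating each mean square to its expected value and solving for the variance component. A minimal Python sketch of that arithmetic is given here for checking purposes; the variable names are illustrative assumptions and the mean squares are the ones used in this section.

```python
# Mean squares for the random terms of the mixed model.
ms_dentist = 17881.717
ms_alloy_dentist = 2378.617
ms_method_dentist = 10902.317
ms_error = 11723.617

a, b = 2, 3  # numbers of alloys and methods

# Solve E(MS) = sigma_e^2 + (coefficient) * (component) for each component.
var_error = ms_error
var_method_dentist = (ms_method_dentist - ms_error) / a
var_alloy_dentist = (ms_alloy_dentist - ms_error) / b
var_dentist = (ms_dentist - ms_error) / (a * b)

print(f"error            {var_error:12.3f}")
print(f"method x dentist {var_method_dentist:12.3f}")
print(f"alloy x dentist  {var_alloy_dentist:12.3f}")
print(f"dentist          {var_dentist:12.3f}")
```

Nothing in this method of moments calculation constrains the results to be nonnegative, which is why two of the estimates come out below zero.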


The negative estimates are probably an indication that the corresponding vari- 
ance components may be zero. The point estimates of variance components 
are consistent with the results on tests of hypotheses. Finally, the confidence 
limits on contrasts for the fixed effects can be constructed in the usual way. For 
example, to obtain 95 percent confidence limits for the difference between the 
two alloy effects, we have 


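\[
\bar{y}_{1..} - \bar{y}_{2..} = \frac{10,912}{3 \times 5} - \frac{10,726}{3 \times 5} = 12.400,
\]

\[
\widehat{\mathrm{Var}}(\bar{y}_{1..} - \bar{y}_{2..}) = \frac{2}{3 \times 5} MS_{AC} = \frac{2}{3 \times 5}(2,378.617) = 317.149,
\]

and

\[
t[4, 0.975] = 2.776.
\]

So the desired confidence limits are

\[
12.400 \pm 2.776\sqrt{317.149} = (-37.037, 61.837).
\]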


One can similarly obtain confidence limits for the difference between any two 
method effects based on the t test. However, since there may be more than one 
comparison of interest, the multiple comparison techniques should generally 
be preferred. 
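
The interval for the alloy difference can be reproduced with a short calculation. The Python sketch below is illustrative only; the variable names are assumptions, and the critical value 2.776 is taken from the text rather than computed.

```python
import math

# Alloy totals and design dimensions from the worked example.
total_gold_foil, total_goldent = 10912, 10726
b, c = 3, 5                 # methods and dentists: each alloy mean averages b*c values
ms_alloy_dentist = 2378.617 # error term for the alloy contrast
t_crit = 2.776              # t[4, 0.975], as quoted in the text

diff = total_gold_foil / (b * c) - total_goldent / (b * c)
var_diff = 2 / (b * c) * ms_alloy_dentist
half_width = t_crit * math.sqrt(var_diff)

lower, upper = diff - half_width, diff + half_width
print(f"difference = {diff:.3f}, variance = {var_diff:.3f}")
print(f"95% limits = ({lower:.3f}, {upper:.3f})")
```

The interval contains zero, in agreement with the nonsignificant alloy main effect reported earlier.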


5.17 USE OF STATISTICAL COMPUTING PACKAGES 


Three-way and higher-order factorial models can be analyzed using either SAS 
ANOVA or GLM procedures. For a balanced design, the recommended pro- 
cedure is ANOVA and for the unbalanced design, GLM must be used. The 
random and mixed model analyses can be handled by the use of RANDOM 
and TEST options. Approximate or pseudo-F tests can be carried out via GLM
using the Satterthwaite procedure. For the estimation of variance components, 
PROC MIXED or VARCOMP must be used. For instructions regarding SAS 
commands see Section 11.1. 
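
To make the Satterthwaite procedure concrete, the following Python sketch computes the synthesized denominator mean square and its approximate degrees of freedom for the thermometer effect in the random effects example of Section 5.15, using the mean squares of that example. The helper function `satterthwaite` and the variable names are illustrative assumptions, not part of any package; the sketch only mirrors the arithmetic behind the pseudo-F test.

```python
# Mean squares and degrees of freedom from the melting point example.
ms_thermom = 3.4444444
ms_tw, df_tw = 0.2777778, 4   # thermometer x week
ms_ta, df_ta = 0.5277778, 4   # thermometer x analyst
ms_err, df_err = 0.1319444, 8


def satterthwaite(terms):
    """Return (combined mean square, approximate df) for a linear
    combination of mean squares given as (coefficient, MS, df) triples."""
    combined = sum(coef * ms for coef, ms, _ in terms)
    denom = sum((coef * ms) ** 2 / df for coef, ms, df in terms)
    return combined, combined ** 2 / denom


# Denominator for the thermometer effect: MS(TW) + MS(TA) - MS(Error).
den_ms, den_df = satterthwaite([
    (1, ms_tw, df_tw),
    (1, ms_ta, df_ta),
    (-1, ms_err, df_err),
])
pseudo_f = ms_thermom / den_ms

print(f"denominator MS = {den_ms:.4f}")
print(f"denominator df = {den_df:.2f}")
print(f"pseudo-F       = {pseudo_f:.4f}")
```

The same function applies to the week and analyst effects by swapping in the corresponding pair of two-factor interaction mean squares.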

Among the SPSS procedures, either ANOVA or MANOVA could be used for 
a fixed effects analysis although ANOVA would be simpler. For the analyses 
involving random and mixed effects models, MANOVA or GLM must be used. 
For the estimation of variance components, VARCOMP is the procedure of 
choice. For instructions regarding SPSS commands see Section 11.2. 




The BMDP programs described in Section 4.15 are also adequate for ana- 
lyzing three-way and higher-order factorial models. No new problems arise for 
analyses involving higher-order factorials. 


5.18 WORKED EXAMPLES USING STATISTICAL PACKAGES 


In this section, we illustrate the applications of statistical packages to
perform three-way analysis of variance for the data sets of examples presented
in Sections 5.14 through 5.16. Figures 5.1 through 5.3 illustrate the program
instructions and the output results for analyzing data in Tables 5.10, 5.16, and
5.21 using SAS ANOVA/GLM, SPSS MANOVA/GLM, and BMDP 2V/8V
procedures. The typical output provides the data format listed at the top, all
cell means, and the entries of the analysis of variance table. It should be
noticed that in each case the results are the same as those provided using manual
computations in Sections 5.14 through 5.16. However, note that certain tests of
significance in a mixed model may differ from one program to the other since
they make different model assumptions.


Program instructions:

    DATA ELECCHRM;
    INPUT TEMP TIME DEGR RESIST;
    DATALINES;
    1 1 1 17.9
    . . . .
    3 3 2 26.9
    ;
    PROC ANOVA;
    CLASSES TEMP TIME DEGR;
    MODEL RESIST=TEMP TIME DEGR TEMP*TIME TEMP*DEGR TIME*DEGR
          TEMP*TIME*DEGR;
    RUN;

Output:

    The SAS System
    Analysis of Variance Procedure

    CLASS    LEVELS    VALUES
    TEMP       3       1 2 3
    TIME       3       1 2 3
    DEGR       2       1 2
    NUMBER OF OBS. IN DATA SET=72

    Dependent Variable: RESIST
                                Sum of          Mean
    Source             DF      Squares        Square    F Value    Pr > F
    Model              17    508.40500      29.90618     101.89    0.0001
    Error              54     15.85000       0.29352
    Corrected Total    71    524.25500

    R-Square        C.V.    Root MSE    RESIST Mean
    0.969767    2.417732      0.5418         22.408

    Source             DF     Anova SS   Mean Square    F Value
    TEMP                2    410.69250     205.34625     699.60
    TIME                2     80.54083      40.27042     137.20
    DEGR                1      0.46722       0.46722       1.59
    TEMP*TIME           4     14.81417       3.70354      12.62
    TEMP*DEGR           2      0.31361       0.15681       0.53
    TIME*DEGR           2      0.70194       0.35097       1.20
    TEMP*TIME*DEGR      4      0.87472       0.21868       0.75

(i) SAS application: SAS ANOVA instructions and output for the three-way fixed effects
analysis of variance.


Program instructions:

    DATA LIST
    /TEMP 1 TIME 3 DEGR 5 RESIST 7-10(1).
    BEGIN DATA.
    1 1 1 17.9
    . . . .
    3 3 2 26.9
    END DATA.
    MANOVA RESIST BY TEMP(1,3) TIME(1,3) DEGR(1,2).

Output:

    Analysis of Variance-Design 1
    Tests of Significance for RESIST using UNIQUE sums of squares
    Sources of variation listed: WITHIN CELLS, TEMP, TIME, DEGR,
    TEMP BY TIME, TEMP BY DEGR, TIME BY DEGR, TEMP BY TIME BY DEGR,
    (Model), (Total), followed by R-Squared.

(ii) SPSS application: SPSS MANOVA instructions and output for the three-way fixed
effects analysis of variance.


Program instructions:

    /INPUT    FILE='C:\SAHAI\TEXTO\EJE12.TXT'.
              FORMAT=FREE.
              VARIABLES=4.
    /VARIABLE NAMES=TE,TI,DEG,RES.
    /GROUP    VARIABLE=TE,TI,DEG.
              CODES(TE)=1,2,3.
              NAMES(TE)=F1,F2,F3.
              CODES(TI)=1,2,3.
              NAMES(TI)=H4,H8,H12.
              CODES(DEGR)=1,2.
              NAMES(DEGR)=YES,NO.
    /DESIGN   DEPENDENT=RESIST.
    /END
    1 1 1 17.9
    . . . .
    3 3 2 26.9

Output:

    BMDP2V - ANALYSIS OF VARIANCE AND COVARIANCE WITH
    REPEATED MEASURES    Release: 7.0 (BMDP/DYNAMIC)

    ANALYSIS OF VARIANCE FOR THE 1-ST DEPENDENT VARIABLE
    THE TRIALS ARE REPRESENTED BY THE VARIABLES: RESIST

    SOURCE               SUM OF    D.F.         MEAN           F      TAIL
                        SQUARES               SQUARE                  PROB.
    MEAN            36153.60507      1   36153.60507   123173.17    0.0000
    TE                410.69245      2     205.34623      699.60    0.0000
    TI                 80.54083      2      40.27041      137.20    0.0000
    DEG                 0.46722      1       0.46722        1.59    0.2125
    TE x TI            14.81417      4       3.70354       12.62    0.0000
    TE x DEG            0.31361      2       0.15681        0.53    0.5892
    TI x DEG            0.70194      2       0.35097        1.20    0.3104
    TE x TI x DEG       0.87472      4       0.21868        0.75    0.5656
    ERROR              15.85000     54       0.29352

(iii) BMDP application: BMDP 2V instructions and output for the three-way fixed
effects analysis of variance.


FIGURE 5.1 Program Instructions and Output for the Three-Way Fixed Effects
Analysis of Variance: Data on Average Resistivities (in m-ohms/cm³) for
Electrolytic Chromium Plate Example (Table 5.10).


Program instructions:

    DATA MELTPOIN;
    INPUT THERMOM WEEK ANALYST MELTINGP;
    DATALINES;
    . . . .
    ;
    PROC GLM;
    CLASSES THERMOM WEEK ANALYST;
    MODEL MELTINGP=THERMOM WEEK ANALYST THERMOM*WEEK
          THERMOM*ANALYST WEEK*ANALYST;
    RANDOM THERMOM WEEK ANALYST THERMOM*WEEK
           THERMOM*ANALYST WEEK*ANALYST /TEST;
    RUN;

Output:

    The SAS System
    General Linear Models Procedure

    CLASS      LEVELS    VALUES
    THERMOM      3       1 2 3
    WEEK         3       1 2 3
    ANALYST      3       1 2 3
    NUMBER OF OBS. IN DATA SET=27

    Dependent Variable: MELTINGP
                                Sum of          Mean
    Source             DF      Squares        Square    F Value    Pr > F
    Model              18    14.611111      0.811728       6.15    0.0066
    Error               8     1.055556      0.131944
    Corrected Total    26    15.666667

    R-Square        C.V.    Root MSE    MELTINGP Mean
    0.932624    0.210101      0.3632           172.89

    Source             DF    Type III SS   Mean Square    F Value    Pr > F
    THERMOM             2      6.8888889     3.4444444      26.11    0.0003
    WEEK                2      0.6666667     0.3333333       2.53    0.1411
    ANALYST             2      2.1666667     1.0833333       8.21    0.0115
    THERMOM*WEEK        4      1.1111111     0.2777778       2.11    0.1719
    THERMOM*ANALYST     4      2.1111111     0.5277778       4.00    0.0453
    WEEK*ANALYST        4      1.6666667     0.4166667       3.16    0.0780

    Source             Type III Expected Mean Square
    THERMOM            Var(Error) + 3 Var(THERMOM*ANALYST)
                       + 3 Var(THERMOM*WEEK) + 9 Var(THERMOM)
    WEEK               Var(Error) + 3 Var(WEEK*ANALYST)
                       + 3 Var(THERMOM*WEEK) + 9 Var(WEEK)
    ANALYST            Var(Error) + 3 Var(WEEK*ANALYST)
                       + 3 Var(THERMOM*ANALYST) + 9 Var(ANALYST)
    THERMOM*WEEK       Var(Error) + 3 Var(THERMOM*WEEK)
    THERMOM*ANALYST    Var(Error) + 3 Var(THERMOM*ANALYST)
    WEEK*ANALYST       Var(Error) + 3 Var(WEEK*ANALYST)

    Source: THERMOM   Error: MS(THERMOM*WEEK) + MS(THERMOM*ANALYST) - MS(Error)
                             Denominator    Denominator
    DF     Type III MS           DF             MS           F Value    Pr > F
     2    3.4444444444          4.98       0.6736111111       5.1134    0.0621

    Source: WEEK      Error: MS(THERMOM*WEEK) + MS(WEEK*ANALYST) - MS(Error)
                             Denominator    Denominator
    DF     Type III MS           DF             MS           F Value    Pr > F
     2    0.3333333333          4.88       0.5625             0.5926    0.5883

    Source: ANALYST   Error: MS(THERMOM*ANALYST) + MS(WEEK*ANALYST) - MS(Error)
                             Denominator    Denominator
    DF     Type III MS           DF             MS           F Value    Pr > F
     2    1.0833333333          5.73       0.8125             1.3333    0.3346

(i) SAS application: SAS GLM instructions and output for the three-way random effects
analysis of variance.


Program instructions:

    DATA LIST
    /THERMOM 1 WEEK 3 ANALYST 5 MELTINGP 7-11 (1).
    BEGIN DATA.
    . . . .
    END DATA.
    GLM MELTINGP BY THERMOM WEEK ANALYST
    /DESIGN THERMOM WEEK ANALYST THERMOM*WEEK
            THERMOM*ANALYST WEEK*ANALYST
    /RANDOM THERMOM WEEK ANALYST.

Output:

    Tests of Between-Subjects Effects
    Dependent Variable: MELTINGP
    Source                         Type III SS     df    Mean Square       F     Sig.
    THERMOM          Hypothesis          6.889      2          3.444   5.113     .062
                     Error               3.355   4.98        .674(a)
    WEEK             Hypothesis           .667      2           .333    .593     .588
                     Error               2.744   4.88        .562(b)
    ANALYST          Hypothesis          2.167      2          1.083   1.333     .335
                     Error               4.655   5.73        .812(c)
    THERMOM*WEEK     Hypothesis          1.111      4           .278   2.105     .172
                     Error               1.056      8        .132(d)
    THERMOM*ANALYST  Hypothesis          2.111      4           .528   4.000     .045
                     Error               1.056      8        .132(d)
    WEEK*ANALYST     Hypothesis          1.667      4           .417   3.158     .078
                     Error               1.056      8        .132(d)
    a MS(T*W)+MS(T*A)-MS(E)    b MS(T*W)+MS(W*A)-1.000 MS(E)
    c MS(T*A)+MS(W*A)-MS(E)    d MS(E)

    Expected Mean Squares (a,b)
                                        Variance Component
    Source             Var(T)   Var(W)   Var(A)   Var(T*W)   Var(T*A)   Var(W*A)   Var(E)
    Intercept           9.000    9.000    9.000      3.000      3.000      3.000    1.000
    THERMOM             9.000     .000     .000      3.000      3.000       .000    1.000
    WEEK                 .000    9.000     .000      3.000       .000      3.000    1.000
    ANALYST              .000     .000    9.000       .000      3.000      3.000    1.000
    THERMOM*WEEK         .000     .000     .000      3.000       .000       .000    1.000
    THERMOM*ANALYST      .000     .000     .000       .000      3.000       .000    1.000
    WEEK*ANALYST         .000     .000     .000       .000       .000      3.000    1.000
    Error                .000     .000     .000       .000       .000       .000    1.000
    a For each source, the expected mean square equals the sum of the
    coefficients in the cells times the variance components, plus a
    quadratic term involving effects in the Quadratic Term cell.
    b Expected Mean Squares are based on the Type III Sums of Squares.

(ii) SPSS application: SPSS GLM instructions and output for the three-way random
effects analysis of variance.


Program instructions:

    /INPUT    FILE='C:\SAHAI\TEXTO\EJE13.TXT'.
              FORMAT=FREE.
              VARIABLES=3.
    /VARIABLE NAMES=A1,A2,A3.
    /DESIGN   NAMES=T,W,A.
              LEVELS=3,3,3.
              RANDOM=T,W,A.
              MODEL='T, W, A'.
    /END
    . . . .

Output:

    BMDP8V - GENERAL MIXED MODEL ANALYSIS OF VARIANCE
    - EQUAL CELL SIZES    Release: 7.0 (BMDP/DYNAMIC)

    ANALYSIS OF VARIANCE DESIGN
    INDEX               T     W     A
    NUMBER OF LEVELS    3     3     3
    POPULATION SIZE     INF   INF   INF
    MODEL               T, W, A

    ANALYSIS OF VARIANCE FOR DEPENDENT VARIABLE 1
    SOURCE       ERROR        SUM OF    D.F.          MEAN       F     PROB.
                 TERM        SQUARES                SQUARE
    1 MEAN                 807045.33      1      807045.33
    2 THERMOM              6.8888889      2      3.4444444
    3 WEEK                 0.6666667      2      0.3333333
    4 ANALYST              2.1666667      2      1.0833333
    5 TW         TWA       1.1111111      4      0.2777778    2.11    0.1719
    6 TA         TWA       2.1111111      4      0.5277778    4.00    0.0453
    7 WA         TWA       1.6666667      4      0.4166667    3.16    0.0780
    8 TWA                  1.0555556      8      0.1319444

    SOURCE       EXPECTED MEAN SQUARE                        ESTIMATES OF
                                                             VARIANCE COMPONENTS
    1 MEAN       27(1)+9(2)+9(3)+9(4)+3(5)+3(6)+3(7)+(8)     29890.42824
    2 THERMOM    9(2)+3(5)+3(6)+(8)                              0.30787
    3 WEEK       9(3)+3(5)+3(7)+(8)                             -0.02546
    4 ANALYST    9(4)+3(6)+3(7)+(8)                              0.03009
    5 TW         3(5)+(8)                                        0.04861
    6 TA         3(6)+(8)                                        0.13194
    7 WA         3(7)+(8)                                        0.09491
    8 TWA        (8)                                             0.13194

(iii) BMDP application: BMDP 8V instructions and output for the three-way random
effects analysis of variance.


FIGURE 5.2 Program Instructions and Output for the Three-Way Random Effects
Analysis of Variance: Data on the Melting Points of a Homogeneous Sample
of Hydroquinone (Table 5.16).


Program instructions:

    DATA DIAMOND;
    INPUT ALLOY METHOD DENTIST NUMBER;
    DATALINES;
    1 1 1 792
    1 1 2 803
    . . . .
    2 3 5 627
    ;
    PROC GLM;
    CLASSES ALLOY METHOD DENTIST;
    MODEL NUMBER=ALLOY METHOD DENTIST ALLOY*METHOD
          ALLOY*DENTIST METHOD*DENTIST;
    RANDOM DENTIST ALLOY*DENTIST METHOD*DENTIST;
    TEST H=ALLOY E=ALLOY*DENTIST;
    TEST H=METHOD E=METHOD*DENTIST;
    RUN;

Output:

    The SAS System
    General Linear Models Procedure

    CLASS      LEVELS    VALUES
    ALLOY        2       1 2
    METHOD       3       1 2 3
    DENTIST      5       1 2 3 4 5
    NUMBER OF OBS. IN DATA SET=30

    Dependent Variable: NUMBER
                                  Sum of           Mean
    Source             DF        Squares         Square    F Value    Pr > F
    Model              21   281464.93333    13403.09206       1.14    0.4473
    Error               8    93788.93333    11723.61667
    Corrected Total    29   375253.86667

    R-Square         C.V.     Root MSE     NUMBER Mean
    0.750065    15.011875    108.27565    721.26666667

    Source             DF    Type III SS    Mean Square    F Value    Pr > F
    ALLOY               1       1153.200       1153.200       0.10    0.7618
    METHOD              2      49598.467      24799.233       2.12    0.1830
    DENTIST             4      71526.867      17881.717       1.53    0.2829
    ALLOY*METHOD        2      62453.400      31226.700       2.66    0.1298
    ALLOY*DENTIST       4       9514.467       2378.617       0.20    0.9297
    METHOD*DENTIST      8      87218.533      10902.317       0.93    0.5396

    Source             Type III Expected Mean Square
    ALLOY              Var(Error) + 3 Var(ALLOY*DENTIST) + Q(ALLOY,ALLOY*METHOD)
    METHOD             Var(Error) + 2 Var(METHOD*DENTIST) + Q(METHOD,ALLOY*METHOD)
    DENTIST            Var(Error) + 2 Var(METHOD*DENTIST) + 3 Var(ALLOY*DENTIST)
                       + 6 Var(DENTIST)
    ALLOY*METHOD       Var(Error) + Q(ALLOY*METHOD)
    ALLOY*DENTIST      Var(Error) + 3 Var(ALLOY*DENTIST)
    METHOD*DENTIST     Var(Error) + 2 Var(METHOD*DENTIST)

    Tests of Hypotheses using the Type III MS for ALLOY*DENTIST as
    an error term
    Source             DF    Type III SS    Mean Square    F Value    Pr > F
    ALLOY               1       1153.200       1153.200       0.48    0.5246

    Tests of Hypotheses using the Type III MS for METHOD*DENTIST as
    an error term
    Source             DF    Type III SS    Mean Square    F Value    Pr > F
    METHOD              2      49598.467      24799.233       2.27    0.1651

(i) SAS application: SAS GLM instructions and output for the three-way mixed effects
analysis of variance.


Program instructions:

    DATA LIST
    /ALLOY 1 METHOD 3 DENTIST 5 NUMBER 7-9.
    BEGIN DATA.
    1 1 1 792
    1 1 2 803
    . . . .
    2 3 5 627
    END DATA.
    GLM NUMBER BY ALLOY METHOD DENTIST
    /DESIGN ALLOY METHOD DENTIST ALLOY*METHOD
            ALLOY*DENTIST METHOD*DENTIST
    /RANDOM DENTIST.

Output:

    Tests of Between-Subjects Effects
    Dependent Variable: NUMBER
    Source                        Type III SS          df    Mean Square        F     Sig.
    ALLOY           Hypothesis       1153.200           1       1153.200     .485     .525
                    Error            9514.467           4    2378.617(a)
    METHOD          Hypothesis      49598.467           2      24799.233    2.275     .165
                    Error           87218.533           8   10902.317(b)
    DENTIST         Hypothesis      71526.867           4      17881.717   11.482
                    Error             112.902   7.250E-02    1557.317(c)
    ALLOY*METHOD    Hypothesis      62453.400           2      31226.700    2.664     .130
                    Error           93788.933           8   11723.617(d)
    ALLOY*DENTIST   Hypothesis       9514.467           4       2378.617     .203     .930
                    Error           93788.933           8   11723.617(d)
    METHOD*DENTIST  Hypothesis      87218.533           8      10902.317     .930     .540
                    Error           93788.933           8   11723.617(d)
    a MS(A*D)    b MS(M*D)    c MS(A*D)+MS(M*D)-MS(E)    d MS(E)

    Expected Mean Squares (a,b)
                                   Variance Component
    Source            Var(D)   Var(A*D)   Var(M*D)   Var(Error)   Quadratic Term
    ALLOY               .000      3.000       .000        1.000   Alloy
    METHOD              .000       .000      2.000        1.000   Method
    DENTIST            6.000      3.000      2.000        1.000
    ALLOY*METHOD        .000       .000       .000        1.000   Alloy*Method
    ALLOY*DENTIST       .000      3.000       .000        1.000
    METHOD*DENTIST      .000       .000      2.000        1.000
    Error               .000       .000       .000        1.000
    a For each source, the expected mean square equals the sum of the
    coefficients in the cells times the variance components, plus a quadratic
    term involving effects in the Quadratic Term cell.
    b Expected Mean Squares are based on the Type III Sums of Squares.

(ii) SPSS application: SPSS GLM instructions and output for the three-way mixed
effects analysis of variance.


Program instructions:

    /INPUT    FILE='C:\SAHAI\TEXTO\EJE14.TXT'.
              FORMAT=FREE.
              VARIABLES=5.
    /VARIABLE NAMES=D1,D2,D3,D4,D5.
    /DESIGN   NAMES=ALLOY,METHOD,DENTIST.
              LEVELS=2,3,5.
              RANDOM=DENTIST.
              FIXED=ALLOY,METHOD.
              MODEL='A, M, D'.
    /END
    792 803 715 673 634
    772 752 792 657 649
    782 715 762 690 724
    824 803 724 946 715
    772 772 715 743 724
    803 707 606 245 627

Output:

    BMDP8V - GENERAL MIXED MODEL ANALYSIS OF VARIANCE
    - EQUAL CELL SIZES    Release: 7.0 (BMDP/DYNAMIC)

    ANALYSIS OF VARIANCE DESIGN
    INDEX               A     M     D
    NUMBER OF LEVELS    2     3     5
    POPULATION SIZE     2     3     INF
    MODEL               A, M, D

    ANALYSIS OF VARIANCE FOR DEPENDENT VARIABLE 1
    SOURCE       ERROR            SUM OF    D.F.            MEAN
                 TERM            SQUARES                  SQUARE
    1 MEAN       DENTIST    15606768.133      1     15606768.133
    2 ALLOY      AD             1153.200      1         1153.200
    3 METHOD     MD            49598.467      2        24799.233
    4 DENTIST                  71526.867      4        17881.717
    5 AM         AMD           62453.400      2        31226.700
    6 AD                        9514.467      4         2378.617
    7 MD                       87218.533      8        10902.317
    8 AMD                      93788.933      8        11723.617

    SOURCE       EXPECTED MEAN SQUARE    ESTIMATES OF
                                         VARIANCE COMPONENTS
    1 MEAN       30(1)+6(4)              519629.54722
    2 ALLOY      15(2)+3(6)                 -81.69444
    3 METHOD     10(3)+2(7)                1389.69167
    4 DENTIST    6(4)                      2980.28611
    5 AM         5(5)+(8)                  3900.61667
    6 AD         3(6)                       792.87222
    7 MD         2(7)                      5451.15833
    8 AMD        (8)                      11723.61667

(iii) BMDP application: BMDP 8V instructions and output for the three-way mixed
effects analysis of variance.


FIGURE 5.3 Program Instructions and Output for the Three-Way Mixed Effects
Analysis of Variance: Data on Diamond Pyramid Hardness Number of Dental
Fillings Made from Two Alloys of Gold (Table 5.21).


EXERCISES 


1. A study was performed on 18 oxides and 18 hydroxides to compare
   the effect of corrosion on three metal products. Six samples from
   each oxide and each hydroxide of each metal product were assigned
   at random into six groups of three each. Three groups from each oxide
   and hydroxide had density measurements taken after a test period of
   30 hours and the other three groups were measured after a test period
   of 60 hours. The relevant data are given as follows.

                                Metal Products
                      Steel          Copper           Zinc
                    Corrosive       Corrosive       Corrosive
   Test Period       Element         Element         Element
   (hrs)            O2     OH       O2     OH       O2     OH
   30              159    134      168    143      139    122
                   135    154      139    145      134    127
                   125    148      167    169      165    117
   60              152    120      112    152      126    113
                   142    163      152    127      151    146
                   124    150      117    151      138    126

   (a) Describe the mathematical model and the assumptions for the
       experiment.
   (b) Analyze the data and report the analysis of variance table.
   (c) Test whether there are differences in the effect of corrosion
       among the three metal products. Use α = 0.05.
   (d) Test whether there are differences in the effect of corrosion
       between the two test periods. Use α = 0.05.
   (e) Test whether there are differences in the effect of corrosion
       between oxides and hydroxides. Use α = 0.05.
   (f) Test the significance of different interaction effects. Use
       α = 0.05.

2. The percentage of silicon carbide (SiC) concentration in an aluminum-
   silicon carbide (Al-SiC) composite, the fusion temperature, and the
   casting time of Al-SiC are being investigated for their effects on the
   strength of Al-SiC. Three levels of SiC concentration, three levels of
   fusion temperature, and two casting times are selected. A factorial
   experiment with two replicates was conducted and the following data
   in certain standard units (MPa) were obtained.

                                       Casting Time
                                 1                          2
   Silicon Carbide        Temperature (°C)           Temperature (°C)
   Concentration (%)    1300    1400    1500       1300    1400    1500
   10                  186.6   187.7   189.8      188.4   189.6   190.6
                       186.0   186.0   189.4      188.6   190.4   190.9
   15                  188.5   186.0   188.4      187.5   188.7   189.6
                       187.2   186.9   187.6      188.1   188.0   189.0
   20                  187.5   185.6   187.4      187.6   187.0   188.5
                       186.6   186.2   188.1      189.4   187.8   189.8

   (a) State the model and the assumptions for the experiment. All
       factors may be regarded as fixed effects.
   (b) Analyze the data and report the analysis of variance table.
   (c) Test whether there are differences in the strength of the Al-SiC
       composite due to the casting time. Use α = 0.05.
   (d) Test whether there are differences in the strength of the Al-SiC
       composite due to the temperature. Use α = 0.05.
   (e) Test whether there are differences in the strength of the Al-SiC
       composite due to the percentage of silicon carbide concentration.
       Use α = 0.05.
   (f) Test the significance of different interaction effects. Use α = 0.05.

3. The production management department of a textile factory is studying
   the effect of several factors on the color of garments used to
   manufacture ladies' dresses. Three machinists, three production circuit
   times, and two relative humidities were selected and three small
   samples of garments were colored under each set of conditions. The
   completed garment was compared to a standard and a range of scale
   was assigned. The data in certain standard units are given as follows.
   (Digits that are illegible in the source are shown as "?".)

                             Relative Humidity
                         Low                  High
   Production         Machinist            Machinist
   Circuit Time      1    2    3          1    2    3
   40               28   32   36         29   43   39
                    29   33   37         28   41   4?
                    30   31   ??         ??   40   4?
   50               41   39   38         42   39   39
                    40   43   39         44   43   41
                    41   44   40         40   41   36
   60               33   40   31         31   41   33
                    29   40   32         34   42   31
                    32   39   30         33   39   29

   (a) State the model and the assumptions for the experiment.
   (b) Analyze the data and report the analysis of variance table.
   (c) Test whether there are differences in coloring due to relative
       humidity. Use α = 0.05.
   (d) Test whether there are differences in coloring due to machinist.
       Use α = 0.05.
   (e) Test whether there are differences in coloring due to production
       circuit time. Use α = 0.05.
   (f) Test the significance of different interaction effects. Use α = 0.05.
   (g) Do exact tests exist for all effects? If not, use the pseudo-F tests
       discussed in this chapter.


4. Consider the following data from a factorial experiment involving
   three factors A, B, and C, where all factors are considered as having
   fixed effects.

                 C1                    C2                    C3
   A1     20.1  19.7  20.8      21.7  19.3  18.3      20.7  20.4  24.1
          33.4  18.5  14.7      20.3  17.8  16.7      19.2  18.6  18.6
          27.2  17.3  18.5      19.4  18.1  15.2      18.1  17.5  16.2
   A2     26.4  22.3  21.2      23.8  20.3  17.3      17.6  22.4  12.7
          19.5  20.4  19.6      22.4  22.1  18.5      19.1  20.7  16.3
          23.3  19.3  18.3      21.2  23.5  20.3      20.8  19.4  17.1

   (a) State the model and the assumptions for the experiment.
   (b) Analyze the data and report the analysis of variance table.
   (c) Carry out tests of significance on all interactions. Use α = 0.05.
   (d) Carry out tests of significance on the main effects. Use α = 0.05.
   (e) Give an illustration of how a significant interaction has masked
       the effect of factor C.


5. Consider a factorial experiment involving three factors A, B, and C,
   and assume a three-way fixed effects model of the form

   \[
   y_{ijk\ell} = \mu + \alpha_i + \beta_j + \gamma_k + (\beta\gamma)_{jk} + e_{ijk\ell}
   \qquad (i = 1, 2, 3, 4;\ j = 1, 2;\ k = 1, 2, 3).
   \]

   It is assumed that all other interactions are either nonexistent or
   negligible. The relevant data are given as follows.

                  B1                  B2
            C1    C2    C3      C1    C2    C3
   A1      5.0   4.5   4.8     5.5   4.2   4.2
           5.8   5.2   5.4     4.3   4.4   4.8
   A2      4.7   3.8   4.2     3.8   3.8   4.8
           4.8   4.2   4.3     4.1   4.2   5.3
   A3      5.7   4.5   4.7     4.5   3.9   3.9
           5.8   4.7   5.5     4.7   4.5   4.6
   A4      4.5   4.3   4.2     2.3   3.8   4.5
           4.3   3.7   4.4     4.4   4.3   5.4

   (a) State the model and the assumptions for the experiment.
   (b) Analyze the data and report the analysis of variance table.
   (c) Perform a test of significance on the B × C interaction. Use
       α = 0.05.
   (d) Perform tests of significance on the main effects, A, B, and C,
       using a pooled error mean square. Use α = 0.05.
   (e) Are two observations for each treatment combination sufficient
       if the power of the test for detecting differences among the levels
       of factor C at the 0.05 level of significance is to be at least 0.8
       when $\gamma_1 = -0.2$, $\gamma_2 = 0.4$, and $\gamma_3 = -0.2$? Use the same pooled
       estimate of $\sigma_e^2$ as obtained in the analysis of variance.

6. Ostle (1952) discussed the results of an analysis of variance of a
   three-factor factorial design. The experiment consisted of determining
   soluble matter in four extract solutions by pipetting in duplicate
   25, 50, and 100 ml volumes of solution into dishes. The solution was
   evaporated and weighed for residues. The experiment was replicated
   by repeating it on three days. The researcher is interested in only
   the four extracts and only the three volumes used in the experiment.
   However, the days are considered to be a random sample of days.
   The mathematical model for this experiment would be:

   \[
   y_{ijk\ell} = \mu + \alpha_i + \beta_j + \gamma_k + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik}
                 + (\beta\gamma)_{jk} + (\alpha\beta\gamma)_{ijk} + e_{ijk\ell}
   \qquad
   \begin{cases}
   i = 1, 2, 3, 4 \\
   j = 1, 2, 3 \\
   k = 1, 2, 3 \\
   \ell = 1, 2,
   \end{cases}
   \]

   where $\mu$ is the general mean, $\alpha_i$ is the effect of the i-th extract, $\beta_j$ is
   the effect of the j-th volume, $\gamma_k$ is the effect of the k-th day, $(\alpha\beta)_{ij}$
   is the interaction of the i-th extract with the j-th volume, $(\alpha\gamma)_{ik}$ is
   the interaction of the i-th extract with the k-th day, $(\beta\gamma)_{jk}$ is the
   interaction of the j-th volume with the k-th day, $(\alpha\beta\gamma)_{ijk}$ is the
   interaction of the i-th extract with the j-th volume and the k-th day,
   and $e_{ijk\ell}$ is the customary error term. Furthermore, it is assumed that
   the $\alpha_i$'s, $\beta_j$'s, and $(\alpha\beta)_{ij}$'s are constants subject to the restrictions:

   \[
   \sum_{i=1}^{4} \alpha_i = \sum_{j=1}^{3} \beta_j = 0, \qquad
   \sum_{i=1}^{4} (\alpha\beta)_{ij} = \sum_{j=1}^{3} (\alpha\beta)_{ij} = 0;
   \]

   and the $\gamma_k$'s, $(\alpha\gamma)_{ik}$'s, $(\beta\gamma)_{jk}$'s, and $e_{ijk\ell}$'s are independently and
   normally distributed with mean zero and variances $\sigma_{\gamma}^2$, $\sigma_{\alpha\gamma}^2$, $\sigma_{\beta\gamma}^2$, and
   $\sigma_e^2$, respectively; and subject to the restrictions:

   \[
   \sum_{i=1}^{4} (\alpha\gamma)_{ik} = \sum_{j=1}^{3} (\beta\gamma)_{jk} = 0 \qquad (k = 1, 2, 3).
   \]

   The analysis of variance computations of the data (not reported here)
   are carried out exactly as in Section 5.16 and the results are given as
   follows.

   Source of                  Degrees of    Mean        Expected
   Variation                  Freedom       Square      Mean Square
   Extract                        3         161.5964    $\sigma_e^2 + 3 \times 2\sigma_{\alpha\gamma}^2 + \frac{3 \times 3 \times 2}{4 - 1} \sum_{i=1}^{4} \alpha_i^2$
   Volume                         2         0.02443     $\sigma_e^2 + 4 \times 2\sigma_{\beta\gamma}^2 + \frac{4 \times 3 \times 2}{3 - 1} \sum_{j=1}^{3} \beta_j^2$
   Day                            2         0.07535     $\sigma_e^2 + 4 \times 3 \times 2\sigma_{\gamma}^2$
   Extract × Volume               6         0.00772     $\sigma_e^2 + 2\sigma_{\alpha\beta\gamma}^2 + \frac{3 \times 2}{(4 - 1)(3 - 1)} \sum_{i=1}^{4} \sum_{j=1}^{3} (\alpha\beta)_{ij}^2$
   Extract × Day                  6         0.03959     $\sigma_e^2 + 3 \times 2\sigma_{\alpha\gamma}^2$
   Volume × Day                   4         0.01501     $\sigma_e^2 + 4 \times 2\sigma_{\beta\gamma}^2$
   Extract × Volume × Day        12         0.00654     $\sigma_e^2 + 2\sigma_{\alpha\beta\gamma}^2$
   Error                         36         0.00565     $\sigma_e^2$
   Total                         71

   Source: Ostle (1952). Used with permission.

   (a) Test whether there are differences in soluble matter among
       different extracts. Use α = 0.05.
   (b) Test whether there are differences in soluble matter among
       different volumes. Use α = 0.05.
   (c) Test whether there are differences in soluble matter among
       different days. Use α = 0.05.
   (d) Test for the following interaction effects: extract × volume,
       extract × day, volume × day, and extract × volume × day. Use
       α = 0.05.
   (e) Determine appropriate point and interval estimates for each of
       the variance components of the model.
   (f) It is found that the effects due to day, extract × day, and volume ×
       day are all significant; that is, the method is unreliable in that
       the differences among volumes will not be the same on different
       days. Give one possible explanation of the excessive variation
       among days. How might this difficulty be overcome?

7. Damon and Harvey (1987, p. 316) reported data from an experiment 
with radishes involving a three-factor factorial design. There were 
two sources of nitrogen — ammonium sulfate and potassium nitrate, 
three levels of nitrogen, and two levels of treatments — nitrapyrin and 
no nitrapyrin. The following data are given where four observations 
are the fresh weights of the plants in grams/pot. 


                          Source of Nitrogen
             Ammonium Sulfate              Potassium Nitrate

Level    Nitrapyrin   No Nitrapyrin    Nitrapyrin   No Nitrapyrin


1 14.3 17.5 17.6 13.9 
15.9 16.7 24.0 16.8 
14.8 15.7 13.5 17.3 
20.8 15.1 17.9 12.6 
2 37.5 39.6 43.3 45.8 
29.4 33.0 53.5 46.9 
33.8 52.8 49.3 48.0 
33.1 36.2 49.9 47.0 
3 41.4 52.5 59.8 77.8 
49.9 53.4 98.4 87.6 
43.2 51.7 79.4 83.4 
40.1 52.2 80.0 84.7 


Source: Damon and Harvey (1987, p. 316). Used with permission. 


(a) State the model and the assumptions for the experiment. Consider 
it to be a fixed effects model. 

(b) Analyze the data and report the analysis of variance table. 

(c) Test whether there are differences in weights due to source of
nitrogen. Use $\alpha = 0.05$.




(d) Test whether there are differences in weights due to levels of
nitrogen. Use $\alpha = 0.05$.

(e) Test whether there are differences in weights due to levels of
treatment. Use $\alpha = 0.05$.

(f) Test the significance of different interaction effects. Use
$\alpha = 0.05$.


8. Scheffé (1959, pp. 145-146) reported data from an experiment
conducted by Otto Dykstra, Jr., at the Research Center of the General
Foods Corporation, to study variation in moisture content of a certain
food product. A four-factor factorial design, involving three kinds of
salt, three amounts of salt, two amounts of acid, and two types of
additives, was used. The moisture content (in grams) of samples in
the experimental stage was recorded, and the data are given as follows.


                                 Amount of Acid
                           1                      2
Kind of   Amount     Type of Additive       Type of Additive
Salt      of Salt      1        2             1        2
1         1            8        5             8        4
          2           17       11            13       10
          3           22       16            20       15
2         1            7        3            10        5
          2           26       17            24       19
          3           34       32            34       29
3         1           10        5             9        4
          2           24       14            24       16
          3           39       33            36       34


Source: Scheffé, (1959, p. 145). Used with permission. 


(a) State the model and the assumptions for the experiment. Consider 
it to be a fixed effects model. Since there is no replication, one 
must think what error term to use. 

(b) Analyze the data as the full factorial model and report the analysis 
of variance table. 

(c) Test whether there are differences in moisture content due to kind
of salt. Use $\alpha = 0.05$.

(d) Test whether there are differences in moisture content due to
amount of salt. Use $\alpha = 0.05$.

(e) Test whether there are differences in moisture content due to
amount of acid. Use $\alpha = 0.05$.

(f) Test whether there are differences in moisture content due to type
of additives. Use $\alpha = 0.05$.

(g) Test the significance of different interaction effects. Use
$\alpha = 0.05$.




9. Davies (1956, p. 275) reported data from a $2^4$ factorial experiment
designed to investigate the effect of acid strength, time of reaction,
amount of acid, and temperature of reaction on the yield of an isatin
derivative. The following data are given where observations are the
yield of an isatin derivative measured in gms/100 gm of base material.


                              Temperature of Reaction
                           60°C                   70°C
                      Amount of Acid         Amount of Acid
Acid       Reaction
Strength   Time       35 ml     45 ml        35 ml     45 ml
87%        15 min     6.08      6.31         6.79      6.77
           30 min     6.53      6.12         6.73      6.49
93%        15 min     6.04      6.09         6.68      6.38
           30 min     6.43      6.36         6.08      6.23


Source: Davies (1956, p. 275). Used with permission.


(a) State the model and the assumptions for the experiment. Consider
it to be a fixed effects model.

(b) Analyze the data assuming that three- and four-factor interactions
are negligible and report the analysis of variance table. (Davies
stated that on technical grounds, the existence of three- and
four-factor interactions is unlikely.)

(c) Test whether there are differences in yield due to acid strength.
Use $\alpha = 0.05$.

(d) Test whether there are differences in yield due to reaction time.
Use $\alpha = 0.05$.

(e) Test whether there are differences in yield due to amount of acid.
Use $\alpha = 0.05$.

(f) Test whether there are differences in yield due to temperature of
reaction. Use $\alpha = 0.05$.

(g) Test the significance of two-factor interaction effects. Use
$\alpha = 0.05$.




Two-Way Nested (Hierarchical) Classification


6.0 PREVIEW 


In Chapters 3 through 5 we considered analysis of variance for experiments
commonly referred to as crossed classifications. In a crossed classification,
data cells are formed by combining each level of one factor with each level
of every other factor. We now consider experiments involving two factors such
that the levels of one factor occur only within the levels of another factor. Here,
the levels of a given factor are all different across the levels of the other factor.
More specifically, given two factors A and B, the levels of B are said to be nested
within the levels of A, or more briefly B is nested within A, if every level of
B appears with only a single level of A in the observations. This means that if
the factor A has $a$ levels, then the levels of B fall into $a$ sets of $b_1, b_2, \ldots, b_a$
levels, respectively, such that the $i$-th set appears with the $i$-th level of A. These
designs are commonly known as nested or hierarchical designs where the levels
of factor B are nested within the levels of factor A.

For example, suppose an industrial firm procures a certain liquid chemical 
from three different locations. The firm wishes to investigate if the strength of 
the chemical is the same from each location. There are four barrels of chemi- 
cals available from each location and three measurements of strength are to be 
made from each barrel. The physical layout can be schematically represented 
as in Figure 6.1. This is a two-way nested or hierarchical design, with barrels 
nested within locations. In the first instance, one may ask why the two factors, 
locations and barrels, are not crossed. If the factors were crossed, then barrel 1 
would always refer to the same barrel, barrel 2 would always refer to the same 
barrel, and so on. In this example, this is clearly not the situation since the 
barrels from each location are unique for that particular location. Thus, barrel 
1 from location I has no relation to barrel 1 from any other location, and so
on. To emphasize the point that barrels from each location are different barrels, 
we may recode the barrels as 1, 2, 3, and 4 from location I; 5, 6, 7, and 8 from 
location II; and 9, 10, 11, and 12 from location III. For another example, suppose 
that in order to study a certain characteristic of a product, samples of size 3 
are taken from each of four spindles within each of three machines. Here, each
separate spindle appears within a single machine and thus spindles are nested
within machines. Again, the layout may be depicted as shown in Figure 6.2.
Both nested and crossed factors can occur in an experimental design. When
each of the factors in an experiment is progressively nested within the
preceding factor, it is called a completely or hierarchically nested design. Such nested
designs are common in many fields of study and are particularly popular in
surveys and industrial experiments similar to the ones previously described. In this
and the following chapter, we consider only the hierarchically nested designs.
The experiments involving both nested and crossed factors are considered in
Chapter 8.


[Figure 6.1: tree diagram with locations I, II, and III at the top level, barrels 1 to 4 nested within each location, and the measurements taken from each barrel at the bottom level.]

FIGURE 6.1. A Layout for the Two-Way Nested Design Where Barrels Are Nested
within Locations.


[Figure 6.2: tree diagram with machines I, II, and III at the top level, spindles 1 to 4 nested within each machine, and the samples taken from each spindle at the bottom level.]

FIGURE 6.2. A Layout for the Two-Way Nested Design Where Spindles Are Nested
within Machines.


TABLE 6.1
Data for a Two-Way Nested Classification

            A_1                          A_2                  ...             A_a
B_11   B_12   ...  B_1b      B_21   B_22   ...  B_2b      ...    B_a1   B_a2   ...  B_ab
y_111  y_121  ...  y_1b1     y_211  y_221  ...  y_2b1     ...    y_a11  y_a21  ...  y_ab1
y_112  y_122  ...  y_1b2     y_212  y_222  ...  y_2b2     ...    y_a12  y_a22  ...  y_ab2
 ...    ...   ...   ...       ...    ...   ...   ...      ...     ...    ...   ...   ...
y_11k  y_12k  ...  y_1bk     y_21k  y_22k  ...  y_2bk     ...    y_a1k  y_a2k  ...  y_abk
 ...    ...   ...   ...       ...    ...   ...   ...      ...     ...    ...   ...   ...
y_11n  y_12n  ...  y_1bn     y_21n  y_22n  ...  y_2bn     ...    y_a1n  y_a2n  ...  y_abn


Remarks: (i) To distinguish the crossed classification from the nested classification,
we can say that factors are crossed if neither is nested within the other.

(ii) When uncertain as to whether a factor is crossed or nested, try to renumber the 
levels of each factor. If the levels of the factor can be renumbered arbitrarily, then the 
factor is considered nested. 

(iii) The nesting of factors can also occur in an experiment when the procedure restricts 
the randomization of factor-level combinations. 

(iv) The experiments involving one-way classification can be thought of as one-way 
nested classification, where a factor corresponding to “replications” is nested within the 
main treatment factor of the experiment. 
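
The definition of nesting and the barrel recoding described in the preview can be checked mechanically. The following is a minimal sketch in Python, not part of the original text; the function names `is_nested` and `recode_nested` are hypothetical and chosen only for illustration.

```python
def is_nested(pairs):
    """Return True if every level of B appears with exactly one level of A.

    `pairs` is an iterable of (a_level, b_level) labels, one per observation.
    """
    parent = {}
    for a_level, b_level in pairs:
        # setdefault records the first A level seen for this B label
        if parent.setdefault(b_level, a_level) != a_level:
            return False
    return True


def recode_nested(pairs):
    """Assign a unique B label to each distinct (A level, B level) pair."""
    codes = {}
    for pair in pairs:
        if pair not in codes:
            codes[pair] = len(codes) + 1
    return codes


# Barrels labeled 1-4 within each of locations I, II, III (labels reused).
raw = [(loc, barrel) for loc in ("I", "II", "III") for barrel in (1, 2, 3, 4)]
codes = recode_nested(raw)
recoded = [(loc, codes[(loc, barrel)]) for loc, barrel in raw]

print(is_nested(raw))      # the reused labels 1-4 hide the nesting
print(is_nested(recoded))  # barrels 1-12 each belong to one location
```

With the reused labels 1 to 4 the data look crossed, which is why the text recommends recoding the barrels as 1 to 12 to make the nesting explicit.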


6.1 MATHEMATICAL MODEL 


Consider two factors A and B having $a$ and $b$ levels, respectively. Let the $b$ levels
of factor B be nested under each level of factor A and let there be $n$ replicates at
each level of factor B. Let $y_{ijk}$ be the observed score corresponding to the $i$-th
level of factor A, $j$-th level of factor B nested within the $i$-th level of factor A,
and the $k$-th replicate within the $j$-th level of B. The data involving the total of
$N = a \times b \times n$ scores $y_{ijk}$'s can then be schematically represented as in Table 6.1.

The analysis of variance model for this type of experimental layout is taken as

$$y_{ijk} = \mu + \alpha_i + \beta_{j(i)} + e_{k(ij)} \quad \begin{cases} i = 1, 2, \ldots, a \\ j = 1, 2, \ldots, b \\ k = 1, 2, \ldots, n, \end{cases} \tag{6.1.1}$$




where $-\infty < \mu < \infty$ is the overall mean, $\alpha_i$ is the effect due to the $i$-th level
of factor A, $\beta_{j(i)}$ is the effect due to the $j$-th level of factor B nested within
the $i$-th level of factor A, and $e_{k(ij)}$ is the error term that takes into account the
random variation within a particular cell. The subscript $j(i)$ means that the $j$-th
level of B is nested within the $i$-th level of A. Note that in the nested model
(6.1.1) the main effects for factor B are missing since the different levels of
factor B are not the same for different levels of factor A. Furthermore, note that
the model has no interaction term between A and B since every level of factor
B does not appear with every level of factor A.


6.2 ASSUMPTIONS OF THE MODEL 


For all models given by (6.1.1), it is assumed that the errors $e_{k(ij)}$'s are
uncorrelated and randomly distributed with mean zero and variance $\sigma_e^2$. Other
assumptions depend on whether the levels of A and B are fixed or random. If
both factors A and B are fixed, we assume that

$$\sum_{i=1}^{a}\alpha_i = \sum_{j=1}^{b}\beta_{j(i)} = 0 \quad (i = 1, 2, \ldots, a).$$

That is, the A factor effects sum to zero and the B factor effects sum to zero
within each level of A. Alternatively, if both A and B are random, then we
assume that the $\alpha_i$'s and $\beta_{j(i)}$'s are mutually and completely uncorrelated and
are randomly distributed with mean zero and variances $\sigma_\alpha^2$ and $\sigma_\beta^2$,
respectively. Mixed models with A fixed and B random or A random and B fixed
are also widely used and will have analogous assumptions. For example, if
we assume that A is fixed and B is random, then the $\beta_{j(i)}$'s are uncorrelated and
randomly distributed with mean zero and variance $\sigma_\beta^2$, and the $\alpha_i$'s are subject
to the restriction that $\sum_{i=1}^{a}\alpha_i = 0$.


6.3. ANALYSIS OF VARIANCE 


Starting with the identity 


$$y_{ijk} - \bar{y}_{...} = (\bar{y}_{i..} - \bar{y}_{...}) + (\bar{y}_{ij.} - \bar{y}_{i..}) + (y_{ijk} - \bar{y}_{ij.}), \tag{6.3.1}$$


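squaring both sides, and summing over $i$, $j$, and $k$, the total sum of squares can
be partitioned as

$$\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n}(y_{ijk} - \bar{y}_{...})^2 = bn\sum_{i=1}^{a}(\bar{y}_{i..} - \bar{y}_{...})^2 + n\sum_{i=1}^{a}\sum_{j=1}^{b}(\bar{y}_{ij.} - \bar{y}_{i..})^2 + \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n}(y_{ijk} - \bar{y}_{ij.})^2, \tag{6.3.2}$$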



with usual notations of dots and bars. The relation (6.3.2) is valid since the
cross-product terms are equal to zero. The identity (6.3.2) states that the total
sum of squares can be partitioned into the following components:


(i) a sum of squares due to factor A,
(ii) a sum of squares due to factor B within the levels of A, and
(iii) a sum of squares due to the residual or error.


As before, the equation (6.3.2) may be written symbolically as

$$SS_T = SS_A + SS_{B(A)} + SS_E. \tag{6.3.3}$$


There are $abn - 1$ degrees of freedom for $SS_T$, $a - 1$ degrees of freedom for
$SS_A$, $a(b - 1)$ degrees of freedom for $SS_{B(A)}$, and $ab(n - 1)$ degrees of freedom
for the error. Note that $(a - 1) + a(b - 1) + ab(n - 1) = abn - 1$. The mean
squares, obtained by dividing each sum of squares on the right side of (6.3.3) by
its degrees of freedom, are denoted by $MS_A$, $MS_{B(A)}$, and $MS_E$, respectively.
The expected mean squares can be derived as before. The traditional analysis of
variance summarizing the results of partitioning of the total sum of squares and
degrees of freedom and that of the expected mean squares is shown in Table 6.2.
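
The partition (6.3.2) and the degrees of freedom can be computed directly from the definitional formulae. The following Python sketch is illustrative only; the function name `nested_anova` and the small data set are assumptions made for the example and do not come from the text.

```python
def nested_anova(y):
    """Balanced two-way nested ANOVA.

    y[i][j][k] is the k-th replicate of the j-th level of B
    nested within the i-th level of A.
    """
    a, b, n = len(y), len(y[0]), len(y[0][0])
    grand = sum(v for group in y for cell in group for v in cell) / (a * b * n)
    a_means = [sum(v for cell in group for v in cell) / (b * n) for group in y]
    cell_means = [[sum(cell) / n for cell in group] for group in y]

    # The three components on the right side of (6.3.2)
    ss_a = b * n * sum((m - grand) ** 2 for m in a_means)
    ss_b_in_a = n * sum((cell_means[i][j] - a_means[i]) ** 2
                        for i in range(a) for j in range(b))
    ss_e = sum((y[i][j][k] - cell_means[i][j]) ** 2
               for i in range(a) for j in range(b) for k in range(n))
    ss_t = sum((v - grand) ** 2 for group in y for cell in group for v in cell)

    df = {"A": a - 1, "B(A)": a * (b - 1), "E": a * b * (n - 1)}
    ss = {"A": ss_a, "B(A)": ss_b_in_a, "E": ss_e}
    ms = {key: ss[key] / df[key] for key in ss}
    return {"df": df, "ss": ss, "ms": ms, "ss_total": ss_t}


# Hypothetical data with a = 2, b = 2, n = 2
y = [[[1, 3], [5, 7]],
     [[2, 4], [10, 12]]]
result = nested_anova(y)
print(result["ss"], result["ss_total"])
```

For these made-up numbers the three components are 18, 80, and 8, which add to the total sum of squares 106, as (6.3.3) requires.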


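TABLE 6.2
Analysis of Variance for Model (6.1.1)

Source of      Degrees of      Sum of        Mean
Variation      Freedom         Squares       Square
Due to A       $a - 1$         $SS_A$        $MS_A$
B within A     $a(b - 1)$      $SS_{B(A)}$   $MS_{B(A)}$
Error          $ab(n - 1)$     $SS_E$        $MS_E$
Total          $abn - 1$       $SS_T$

Expected Mean Square

Model I (A Fixed, B Fixed)
Due to A       $\sigma_e^2 + \frac{bn}{a-1}\sum_{i=1}^{a}\alpha_i^2$
B within A     $\sigma_e^2 + \frac{n}{a(b-1)}\sum_{i=1}^{a}\sum_{j=1}^{b}\beta_{j(i)}^2$
Error          $\sigma_e^2$

Model II (A Random, B Random)
Due to A       $\sigma_e^2 + n\sigma_\beta^2 + bn\sigma_\alpha^2$
B within A     $\sigma_e^2 + n\sigma_\beta^2$
Error          $\sigma_e^2$

Model III (A Fixed, B Random)
Due to A       $\sigma_e^2 + n\sigma_\beta^2 + \frac{bn}{a-1}\sum_{i=1}^{a}\alpha_i^2$
B within A     $\sigma_e^2 + n\sigma_\beta^2$
Error          $\sigma_e^2$

Model III (A Random, B Fixed)
Due to A       $\sigma_e^2 + bn\sigma_\alpha^2$
B within A     $\sigma_e^2 + \frac{n}{a(b-1)}\sum_{i=1}^{a}\sum_{j=1}^{b}\beta_{j(i)}^2$
Error          $\sigma_e^2$


6.4 TESTS OF HYPOTHESES: THE ANALYSIS
OF VARIANCE F TESTS


Under the assumption of normality, the mean squares $MS_A$, $MS_{B(A)}$, and $MS_E$
are independently distributed as (scaled) chi-square variables such that the ratio
of any two mean squares is distributed as the variance ratio $F$. Table 6.2 suggests
that if the levels of A and B are fixed, then the hypotheses

$$H_0^A: \text{all } \alpha_i = 0 \quad \text{versus} \quad H_1^A: \text{not all } \alpha_i = 0$$

and

$$H_0^B: \text{all } \beta_{j(i)} = 0 \quad \text{versus} \quad H_1^B: \text{not all } \beta_{j(i)} = 0$$

can be tested by the ratios

$$F_A = MS_A / MS_E$$

and

$$F_B = MS_{B(A)} / MS_E,$$

respectively. It can be readily shown that under the null hypotheses $H_0^A$ and
$H_0^B$, we have

$$F_A \sim F[a - 1, ab(n - 1)]$$

and

$$F_B \sim F[a(b - 1), ab(n - 1)].$$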


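Thus, $H_0^A$ and $H_0^B$ are tested using the test statistics $F_A$ and $F_B$, respectively.
Similarly, if both A and B are random factors, we test¹

$$H_0^A: \sigma_\alpha^2 = 0 \quad \text{versus} \quad H_1^A: \sigma_\alpha^2 > 0$$

by

$$F_A = MS_A / MS_{B(A)}$$

and

$$H_0^B: \sigma_\beta^2 = 0 \quad \text{versus} \quad H_1^B: \sigma_\beta^2 > 0$$

by

$$F_B = MS_{B(A)} / MS_E.$$

Finally, if A is a fixed factor and B is random, then

$$H_0^A: \text{all } \alpha_i = 0 \quad \text{versus} \quad H_1^A: \text{not all } \alpha_i = 0$$

is tested by

$$F_A = MS_A / MS_{B(A)}$$

and

$$H_0^B: \sigma_\beta^2 = 0 \quad \text{versus} \quad H_1^B: \sigma_\beta^2 > 0$$

is tested by

$$F_B = MS_{B(A)} / MS_E.$$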


Remark: If one does not reject the null hypothesis $H_0^B: \sigma_\beta^2 = 0$, then $MS_{B(A)}$ might be
considered to estimate the same population variance as does $MS_E$. Thus, as remarked
earlier in Section 4.6, some authors recommend computing a pooled mean square by
pooling the sums of squares $SS_{B(A)}$ and $SS_E$ and the corresponding degrees of freedom
$a(b - 1)$ and $ab(n - 1)$ in Table 6.2. This will theoretically provide a more powerful
test for differences due to levels of factor A. However, as indicated before, there is no
widespread agreement on this matter and care should be exercised in resorting to the
pooling procedure.


¹ For some results on approximate tests for other hypotheses concerning $\sigma_\alpha^2$ and $\sigma_\beta^2$, involving a
nonzero value and one- or two-sided alternatives, see Hartung and Voet (1987) and Burdick and
Graybill (1992, pp. 82-83).
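
The choice of denominator in Section 6.4 depends only on whether B is fixed or random. As a rough illustration, the following Python sketch returns the F ratios and their degrees of freedom for each model; the function name `f_ratios` and the numerical mean squares are hypothetical, and critical values or p-values would still have to be taken from F tables or statistical software.

```python
def f_ratios(ms_a, ms_b_in_a, ms_e, a, b, n, model="I"):
    """Return {effect: (F, numerator df, denominator df)}.

    model "I":   A fixed, B fixed   -> A is tested against MS_E
    model "II":  A random, B random -> A is tested against MS_B(A)
    model "III": A fixed, B random  -> A is tested against MS_B(A)
    """
    df_a, df_b, df_e = a - 1, a * (b - 1), a * b * (n - 1)
    if model == "I":
        f_a = (ms_a / ms_e, df_a, df_e)
    elif model in ("II", "III"):
        f_a = (ms_a / ms_b_in_a, df_a, df_b)
    else:
        raise ValueError("model must be 'I', 'II', or 'III'")
    # B within A is tested against the error mean square in every case
    f_b = (ms_b_in_a / ms_e, df_b, df_e)
    return {"A": f_a, "B(A)": f_b}


# Hypothetical mean squares from a design with a = 2, b = 2, n = 2
fixed = f_ratios(18.0, 40.0, 2.0, a=2, b=2, n=2, model="I")
random_model = f_ratios(18.0, 40.0, 2.0, a=2, b=2, n=2, model="II")
print(fixed["A"], random_model["A"])
```

The same data can therefore lead to different conclusions about factor A under different models, since both the ratio and its denominator degrees of freedom change.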


6.5 POINT ESTIMATION 


In this section, we present results on point estimation for parameters of interest 
under fixed, random, and mixed effects models. 


MODEL I (FIXED EFFECTS) 


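If both factors A and B are fixed, the parameters $\mu$, $\alpha_i$'s, and $\beta_{j(i)}$'s may be
estimated by the least squares procedure by minimizing the quantity

$$Q = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n}(y_{ijk} - \mu - \alpha_i - \beta_{j(i)})^2 \tag{6.5.1}$$

with respect to $\mu$, $\alpha_i$, and $\beta_{j(i)}$; and subject to the restrictions:

$$\sum_{i=1}^{a}\alpha_i = \sum_{j=1}^{b}\beta_{j(i)} = 0 \quad (i = 1, 2, \ldots, a). \tag{6.5.2}$$

It can be readily shown by the methods of elementary calculus that the resulting
solutions are:

$$\hat{\mu} = \bar{y}_{...}, \tag{6.5.3}$$

$$\hat{\alpha}_i = \bar{y}_{i..} - \bar{y}_{...}, \quad i = 1, 2, \ldots, a, \tag{6.5.4}$$

and

$$\hat{\beta}_{j(i)} = \bar{y}_{ij.} - \bar{y}_{i..}, \quad i = 1, 2, \ldots, a; \; j = 1, 2, \ldots, b. \tag{6.5.5}$$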


It should be observed that the estimators (6.5.3) through (6.5.5) have consider- 
able intuitive appeal; the A treatment effects are estimated by the average of all 
observations under each level of A minus the grand mean, and the B treatment 
effects within each level of A are estimated by the corresponding cell average 
minus the average under the level of A. 
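
The estimators (6.5.3) through (6.5.5) are simple differences of means, as the following Python sketch illustrates. The function name `fixed_effects_estimates` and the data are hypothetical and serve only as an example.

```python
def fixed_effects_estimates(y):
    """Least squares estimates of mu, alpha_i, and beta_j(i) under Model I.

    y[i][j][k] is the k-th replicate in the j-th level of B within
    the i-th level of A (balanced data).
    """
    a, b, n = len(y), len(y[0]), len(y[0][0])
    grand = sum(v for group in y for cell in group for v in cell) / (a * b * n)
    a_means = [sum(v for cell in group for v in cell) / (b * n) for group in y]
    cell_means = [[sum(cell) / n for cell in group] for group in y]

    mu_hat = grand                                         # (6.5.3)
    alpha_hat = [m - grand for m in a_means]               # (6.5.4)
    beta_hat = [[cell_means[i][j] - a_means[i] for j in range(b)]
                for i in range(a)]                         # (6.5.5)
    return mu_hat, alpha_hat, beta_hat


# Hypothetical data with a = 2, b = 2, n = 2
y = [[[1, 3], [5, 7]],
     [[2, 4], [10, 12]]]
mu_hat, alpha_hat, beta_hat = fixed_effects_estimates(y)
print(mu_hat, alpha_hat, beta_hat)
```

By construction the estimated A effects sum to zero, and the estimated B effects sum to zero within each level of A, in agreement with the restrictions (6.5.2).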

The estimators (6.5.3) through (6.5.5) have variances given by

$$\operatorname{Var}(\hat{\mu}) = \sigma_e^2 / abn, \tag{6.5.6}$$

$$\operatorname{Var}(\hat{\alpha}_i) = (a - 1)\sigma_e^2 / abn, \tag{6.5.7}$$

and

$$\operatorname{Var}(\hat{\beta}_{j(i)}) = (b - 1)\sigma_e^2 / bn. \tag{6.5.8}$$


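The other parameters of interest may include: $\mu + \alpha_i$ (means of factor level A),
pairwise differences $\alpha_i - \alpha_{i'}$, $\beta_{j(i)} - \beta_{j'(i)}$, and the contrasts of the type

$$\sum_{i=1}^{a}\ell_i\alpha_i \; \left(\sum_{i=1}^{a}\ell_i = 0\right) \quad \text{and} \quad \sum_{j=1}^{b}\ell'_j\beta_{j(i)} \; \left(\sum_{j=1}^{b}\ell'_j = 0\right).$$

Their respective estimates along with the variances are:

$$\widehat{\mu + \alpha_i} = \bar{y}_{i..}, \quad \operatorname{Var}(\widehat{\mu + \alpha_i}) = \sigma_e^2 / bn; \tag{6.5.9}$$

$$\hat{\alpha}_i - \hat{\alpha}_{i'} = \bar{y}_{i..} - \bar{y}_{i'..}, \quad \operatorname{Var}(\hat{\alpha}_i - \hat{\alpha}_{i'}) = 2\sigma_e^2 / bn; \tag{6.5.10}$$

$$\hat{\beta}_{j(i)} - \hat{\beta}_{j'(i)} = \bar{y}_{ij.} - \bar{y}_{ij'.}, \quad \operatorname{Var}(\hat{\beta}_{j(i)} - \hat{\beta}_{j'(i)}) = 2\sigma_e^2 / n; \tag{6.5.11}$$

$$\sum_{i=1}^{a}\ell_i\hat{\alpha}_i = \sum_{i=1}^{a}\ell_i\bar{y}_{i..}, \quad \operatorname{Var}\left(\sum_{i=1}^{a}\ell_i\hat{\alpha}_i\right) = \left(\sum_{i=1}^{a}\ell_i^2\right)\sigma_e^2 / bn; \tag{6.5.12}$$

and

$$\sum_{j=1}^{b}\ell'_j\hat{\beta}_{j(i)} = \sum_{j=1}^{b}\ell'_j\bar{y}_{ij.}, \quad \operatorname{Var}\left(\sum_{j=1}^{b}\ell'_j\hat{\beta}_{j(i)}\right) = \left(\sum_{j=1}^{b}\ell'^{\,2}_j\right)\sigma_e^2 / n. \tag{6.5.13}$$

The best quadratic unbiased estimator of $\sigma_e^2$ is, of course, provided by the error
mean square.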


MODEL II (RANDOM EFFECTS) 


When both factors A and B are random, the analysis of variance method can be
used to estimate the variance components $\sigma_e^2$, $\sigma_\beta^2$, and $\sigma_\alpha^2$. From the expected
mean squares column under Model II of Table 6.2, we obtain

$$\hat{\sigma}_e^2 = MS_E, \tag{6.5.14}$$

$$\hat{\sigma}_\beta^2 = (MS_{B(A)} - MS_E)/n, \tag{6.5.15}$$

and

$$\hat{\sigma}_\alpha^2 = (MS_A - MS_{B(A)})/bn. \tag{6.5.16}$$

The optimal properties of the analysis of variance estimators discussed in
Section 2.9 also apply here. However, again $\hat{\sigma}_\beta^2$ and $\hat{\sigma}_\alpha^2$ can produce negative
estimates.² A negative estimate may provide an indication that the
corresponding variance component may be zero. One then might want to replace any
negative estimates by zero, pool the adjacent mean squares, and subtract the
pooled mean square from the next higher mean square for estimating the
corresponding variance component. Finally, we may note that a minimum variance
unbiased estimator for $\sigma_\beta^2/\sigma_e^2$ is given by

$$\frac{1}{n}\left[\frac{MS_{B(A)}}{MS_E}\left(1 - \frac{2}{ab(n - 1)}\right) - 1\right]. \tag{6.5.17}$$
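
The analysis of variance estimators (6.5.14) through (6.5.16) and the ratio estimator (6.5.17) can be sketched in Python as follows. The function names and the mean squares are hypothetical; the `truncate` option only replaces negative estimates by zero and does not carry out the pooling step described in the text.

```python
def variance_components(ms_a, ms_b_in_a, ms_e, b, n, truncate=True):
    """ANOVA estimates of the variance components under Model II."""
    sigma2_e = ms_e                              # (6.5.14)
    sigma2_beta = (ms_b_in_a - ms_e) / n         # (6.5.15)
    sigma2_alpha = (ms_a - ms_b_in_a) / (b * n)  # (6.5.16)
    if truncate:
        # Replace negative estimates by zero (no pooling is attempted here)
        sigma2_beta = max(sigma2_beta, 0.0)
        sigma2_alpha = max(sigma2_alpha, 0.0)
    return {"e": sigma2_e, "beta": sigma2_beta, "alpha": sigma2_alpha}


def ratio_estimate(ms_b_in_a, ms_e, a, b, n):
    """Estimator (6.5.17) of sigma_beta^2 / sigma_e^2."""
    nu_e = a * b * (n - 1)
    return ((ms_b_in_a / ms_e) * (1 - 2 / nu_e) - 1) / n


# Hypothetical mean squares from a design with a = 2, b = 2, n = 2
raw = variance_components(18.0, 40.0, 2.0, b=2, n=2, truncate=False)
clipped = variance_components(18.0, 40.0, 2.0, b=2, n=2)
ratio = ratio_estimate(40.0, 2.0, a=2, b=2, n=2)
print(raw, clipped, ratio)
```

In this made-up example $MS_A < MS_{B(A)}$, so the untruncated estimate of $\sigma_\alpha^2$ is negative, which illustrates the situation discussed above.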


MODEL III (MIXED EFFECTS) 


Many applications of this design involve a mixed model with the main factor
A fixed and the nested factor B random. For a mixed model situation, the fixed
effects $\alpha_i$'s are estimated by

$$\hat{\alpha}_i = \bar{y}_{i..} - \bar{y}_{...}, \quad i = 1, 2, \ldots, a,$$

and the variance components $\sigma_\beta^2$ and $\sigma_e^2$ can be estimated by eliminating the line
corresponding to A from the mean square column of Table 6.2 and applying
the analysis of variance method to the next two lines. Table 6.3 summarizes the
results on point estimates of some common parameters of interest.


TABLE 6.3
Point Estimates and Their Variances under Model III

Parameter                                         Point Estimate                            Variance of Estimate
$\mu$                                             $\bar{y}_{...}$                           $(\sigma_e^2 + n\sigma_\beta^2)/abn$
$\mu + \alpha_i$                                  $\bar{y}_{i..}$                           $(\sigma_e^2 + n\sigma_\beta^2)/bn$
$\alpha_i$                                        $\bar{y}_{i..} - \bar{y}_{...}$           $(a - 1)(\sigma_e^2 + n\sigma_\beta^2)/abn$
$\alpha_i - \alpha_{i'}$                          $\bar{y}_{i..} - \bar{y}_{i'..}$          $2(\sigma_e^2 + n\sigma_\beta^2)/bn$
$\sum_{i=1}^{a}\ell_i\alpha_i$ $(\sum_{i=1}^{a}\ell_i = 0)$   $\sum_{i=1}^{a}\ell_i\bar{y}_{i..}$   $(\sum_{i=1}^{a}\ell_i^2)(\sigma_e^2 + n\sigma_\beta^2)/bn$
$\sigma_e^2$                                      $MS_E$                                    $2\sigma_e^4/ab(n - 1)$
$\sigma_e^2 + n\sigma_\beta^2$                    $MS_{B(A)}$                               $2(\sigma_e^2 + n\sigma_\beta^2)^2/a(b - 1)$
$\sigma_\beta^2$                                  $(MS_{B(A)} - MS_E)/n$                    $\frac{2}{n^2}\left[\frac{(\sigma_e^2 + n\sigma_\beta^2)^2}{a(b - 1)} + \frac{\sigma_e^4}{ab(n - 1)}\right]$


² For a discussion of the maximum likelihood and other nonnegative estimation procedures and
their properties, the reader is referred to Sahai (1974b, 1976).


6.6 INTERVAL ESTIMATION


In this section, we summarize results on confidence intervals for parameters of
interest under fixed, random, and mixed effects models.


MODEL I (FIXED EFFECTS)


An exact confidence interval for $\sigma_e^2$ can be based on the chi-square distribution
of $ab(n - 1)MS_E/\sigma_e^2$. Thus, a $1 - \alpha$ level confidence interval for $\sigma_e^2$ is

$$\frac{ab(n - 1)MS_E}{\chi^2[ab(n - 1), 1 - \alpha/2]} < \sigma_e^2 < \frac{ab(n - 1)MS_E}{\chi^2[ab(n - 1), \alpha/2]}. \tag{6.6.1}$$

Furthermore, confidence intervals based on the $t$ distribution for a particular
treatment or factor level mean $\mu + \alpha_i$ or $\alpha_i$ can be readily obtained. For example,

$$\bar{y}_{i..} - t[ab(n - 1), 1 - \alpha/2]\sqrt{\frac{MS_E}{bn}} < \mu + \alpha_i < \bar{y}_{i..} + t[ab(n - 1), 1 - \alpha/2]\sqrt{\frac{MS_E}{bn}}$$

and

$$(\bar{y}_{i..} - \bar{y}_{...}) - t[ab(n - 1), 1 - \alpha/2]\sqrt{\frac{(a - 1)MS_E}{abn}} < \alpha_i < (\bar{y}_{i..} - \bar{y}_{...}) + t[ab(n - 1), 1 - \alpha/2]\sqrt{\frac{(a - 1)MS_E}{abn}}$$

give exact $1 - \alpha$ level confidence intervals for $\mu + \alpha_i$ and $\alpha_i$, respectively.
Similarly, to obtain confidence limits for $\alpha_i - \alpha_{i'}$, we note from (6.5.10) that

$$E(\bar{y}_{i..} - \bar{y}_{i'..}) = \alpha_i - \alpha_{i'}$$

and

$$\operatorname{Var}(\bar{y}_{i..} - \bar{y}_{i'..}) = 2\sigma_e^2/bn.$$

The confidence limits can, therefore, be derived from the relation

$$\frac{(\bar{y}_{i..} - \bar{y}_{i'..}) - (\alpha_i - \alpha_{i'})}{\sqrt{2MS_E/bn}} \sim t[ab(n - 1)]. \tag{6.6.2}$$

Similar results on confidence intervals for $\beta_{j(i)}$'s and any pairwise differences
on them, using the $t$ distribution, can also be obtained. However, multiple
comparison methods, discussed in Section 6.9, are usually preferable.




MODEL II (RANDOM EFFECTS) 


An exact confidence interval for the error variance component $\sigma_e^2$ is of course
obtained as in (6.6.1). However, exact confidence intervals for the variance
components $\sigma_\beta^2$ and $\sigma_\alpha^2$ do not exist. One can, nevertheless, construct exact intervals
for $\sigma_\beta^2/\sigma_e^2$, $\sigma_e^2/(\sigma_e^2 + \sigma_\beta^2)$, and $\sigma_\beta^2/(\sigma_e^2 + \sigma_\beta^2)$ by using results on the sampling
distribution of the ratio of two mean squares. In particular, the probability is
$1 - \alpha$ that the interval

$$\left[\frac{1}{n}\left(\frac{MS_{B(A)}}{MS_E} \cdot \frac{1}{F[a(b - 1), ab(n - 1); 1 - \alpha/2]} - 1\right), \; \frac{1}{n}\left(\frac{MS_{B(A)}}{MS_E} \cdot \frac{1}{F[a(b - 1), ab(n - 1); \alpha/2]} - 1\right)\right] \tag{6.6.3}$$

captures $\sigma_\beta^2/\sigma_e^2$. Exact intervals on $\sigma_e^2/(\sigma_e^2 + \sigma_\beta^2)$ and $\sigma_\beta^2/(\sigma_e^2 + \sigma_\beta^2)$ are obtained
from (6.6.3) using appropriate transformations.³ Similarly, from a combination
of two confidence intervals, one for $\sigma_e^2$ and the other for $\sigma_e^2 + n\sigma_\beta^2$, one can
obtain a conservative $100(1 - \alpha)$ percent confidence interval for $\sigma_\beta^2$ as

$$0 \le \sigma_\beta^2 \le \frac{1}{n}\left[\frac{a(b - 1)MS_{B(A)}}{\chi^2[a(b - 1), \alpha/2]} - \frac{ab(n - 1)MS_E}{\chi^2[ab(n - 1), 1 - \alpha/2]}\right].$$

For the random effects model, one sometimes may also want to determine a
confidence interval for $\mu$. To obtain confidence limits for $\mu$, we note that

$$E(\bar{y}_{...}) = \mu$$

and

$$\operatorname{Var}(\bar{y}_{...}) = \frac{\sigma_e^2 + n\sigma_\beta^2 + bn\sigma_\alpha^2}{abn}.$$

Now, $\operatorname{Var}(\bar{y}_{...})$ can be estimated by $MS_A/abn$, and it follows that

$$\frac{\bar{y}_{...} - \mu}{\sqrt{MS_A/abn}} \sim t[a - 1].$$

The confidence limits for $\mu$ can now be determined using the standard normal
theory.


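³ For some results on confidence intervals for the variance components $\sigma_\beta^2$ and $\sigma_\alpha^2$, the total variance
$\sigma_e^2 + \sigma_\beta^2 + \sigma_\alpha^2$, ratios of variance components, and proportions of variability such as
$\sigma_e^2/(\sigma_e^2 + \sigma_\beta^2 + \sigma_\alpha^2)$, $\sigma_\beta^2/(\sigma_e^2 + \sigma_\beta^2 + \sigma_\alpha^2)$, and $\sigma_\alpha^2/(\sigma_e^2 + \sigma_\beta^2 + \sigma_\alpha^2)$, see Burdick
and Graybill (1992, pp. 80-90).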




MODEL III (MIXED EFFECTS) 


An exact confidence interval for $\sigma_e^2$ is of course given as in (6.6.1). An exact
confidence interval for $\sigma_\beta^2$, however, does not exist. One can obtain an exact
interval for $\sigma_\beta^2/\sigma_e^2$ by using a procedure based on the statistic $MS_{B(A)}/MS_E$.
Also, as in the case of Model I, it is possible to obtain confidence intervals for
$\mu$, $\alpha_i$, $\mu + \alpha_i$, $\alpha_i - \alpha_{i'}$, the contrast $\sum_{i=1}^{a}\ell_i\alpha_i$ $(\sum_{i=1}^{a}\ell_i = 0)$, or any linear
combination of the means $\sum_{i=1}^{a}\ell_i(\mu + \alpha_i)$, where the $\ell_i$'s are any set of constants.
Thus, for example, exact $100(1 - \alpha)$ percent confidence intervals for $\mu + \alpha_i$
and $\sum_{i=1}^{a}\ell_i(\mu + \alpha_i)$ are given by

$$\bar{y}_{i..} \pm t[a(b - 1), 1 - \alpha/2]\sqrt{\frac{MS_{B(A)}}{bn}}$$

and

$$\sum_{i=1}^{a}\ell_i\bar{y}_{i..} \pm t[a(b - 1), 1 - \alpha/2]\sqrt{\frac{MS_{B(A)}}{bn}\sum_{i=1}^{a}\ell_i^2},$$

respectively. However, again, multiple comparison methods discussed in
Section 6.9 are to be preferred.


6.7 COMPUTATIONAL FORMULAE AND PROCEDURE 


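The computational formulae for the sums of squares may be obtained by
expanding the corresponding definitional formulae and simplifying the algebra.
They are:

$$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n}y_{ijk}^2 - \frac{y_{...}^2}{abn},$$

$$SS_A = \frac{1}{bn}\sum_{i=1}^{a}y_{i..}^2 - \frac{y_{...}^2}{abn},$$

$$SS_{B(A)} = \frac{1}{n}\sum_{i=1}^{a}\sum_{j=1}^{b}y_{ij.}^2 - \frac{1}{bn}\sum_{i=1}^{a}y_{i..}^2,$$

and

$$SS_E = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n}y_{ijk}^2 - \frac{1}{n}\sum_{i=1}^{a}\sum_{j=1}^{b}y_{ij.}^2.$$

Note that $SS_{B(A)}$ can be written as

$$SS_{B(A)} = \sum_{i=1}^{a}\left[\frac{1}{n}\sum_{j=1}^{b}y_{ij.}^2 - \frac{y_{i..}^2}{bn}\right].$$

This shows the idea that $SS_{B(A)}$ is the sum of squares between the levels of B
for each level of A which is summed over all levels of A.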
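
The computational formulae work with totals rather than means. A minimal Python sketch follows, assuming the usual forms based on the cell totals $y_{ij.}$, the factor A totals $y_{i..}$, and the correction term $y_{...}^2/abn$; the function name `ss_from_totals` and the data are hypothetical.

```python
def ss_from_totals(y):
    """Sums of squares for balanced nested data computed from totals.

    y[i][j][k] is the k-th replicate in the j-th level of B within
    the i-th level of A.
    """
    a, b, n = len(y), len(y[0]), len(y[0][0])
    sum_sq = sum(v * v for group in y for cell in group for v in cell)
    cell_totals = [[sum(cell) for cell in group] for group in y]
    a_totals = [sum(row) for row in cell_totals]
    correction = sum(a_totals) ** 2 / (a * b * n)  # y...^2 / abn

    a_term = sum(t * t for t in a_totals) / (b * n)
    cell_term = sum(t * t for row in cell_totals for t in row) / n
    return {
        "T": sum_sq - correction,
        "A": a_term - correction,
        "B(A)": cell_term - a_term,
        "E": sum_sq - cell_term,
    }


# Hypothetical data with a = 2, b = 2, n = 2
y = [[[1, 3], [5, 7]],
     [[2, 4], [10, 12]]]
ss = ss_from_totals(y)
print(ss)
```

For these numbers the results agree with the definitional formulae of Section 6.3: 18 for A, 80 for B within A, 8 for error, and 106 in total.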


6.8 POWER OF THE ANALYSIS OF VARIANCE F TESTS 


For the fixed effects or Model I, we can calculate the power of each of the F tests
given in Section 6.4 in the usual way. Thus, when B effects are investigated,
we have

$$\phi_{B(A)} = \frac{1}{\sigma_e}\left[\frac{n\sum_{i=1}^{a}\sum_{j=1}^{b}\beta_{j(i)}^2}{a(b - 1) + 1}\right]^{1/2}.$$

Similarly, for A effects, we have

$$\phi_A = \frac{1}{\sigma_e}\left[\frac{bn\sum_{i=1}^{a}\alpha_i^2}{a}\right]^{1/2},$$

and so on. Except for this change, the power calculations remain unchanged.
Power formulae under Models II and III can similarly be obtained.
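
As an illustration of the Model I power calculations, the following Python sketch evaluates the parameter $\phi$ for the A effects and for the B within A effects, assuming the usual form in which the sum of squared effects is divided by the numerator degrees of freedom plus one. The function names and the effect values are hypothetical; the resulting $\phi$ would then be used with power charts or software for the noncentral F distribution.

```python
import math


def phi_a(alpha, sigma_e, b, n):
    """phi for the A effects: sqrt(bn * sum(alpha_i^2) / a) / sigma_e."""
    a = len(alpha)
    return math.sqrt(b * n * sum(x * x for x in alpha) / a) / sigma_e


def phi_b_in_a(beta, sigma_e, n):
    """phi for B within A; beta[i][j] is the effect beta_j(i)."""
    a, b = len(beta), len(beta[0])
    total = sum(x * x for row in beta for x in row)
    return math.sqrt(n * total / (a * (b - 1) + 1)) / sigma_e


# Hypothetical effects for a design with a = 2, b = 2, n = 2, sigma_e = 1
value_a = phi_a([-1.5, 1.5], sigma_e=1.0, b=2, n=2)
value_b = phi_b_in_a([[-2.0, 2.0], [-4.0, 4.0]], sigma_e=1.0, n=2)
print(value_a, value_b)
```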


6.9 MULTIPLE COMPARISON METHODS 


When the factor A effect is fixed and the null hypothesis concerning A effects
is rejected, we may want to use Tukey, Scheffé, or other methods to investigate
contrasts of interest. For example, if $H_0^A$ is rejected, we may want to investigate
contrasts involving $\alpha_i$'s of the form

$$L = \sum_{i=1}^{a}\ell_i\alpha_i \quad \left(\sum_{i=1}^{a}\ell_i = 0\right),$$

which is estimated by

$$\hat{L} = \sum_{i=1}^{a}\ell_i\bar{y}_{i..}.$$

Now, if B is also fixed (i.e., we have a Model I situation), then

$$\widehat{\operatorname{Var}}(\hat{L}) = \frac{MS_E}{bn}\sum_{i=1}^{a}\ell_i^2.$$

Further, for the Tukey's procedure involving pairwise differences, we have

$$T = q[a, ab(n - 1); 1 - \alpha].$$

For the Scheffé's procedure involving general contrasts, we would have

$$S^2 = F[a - 1, ab(n - 1); 1 - \alpha].$$

In particular, for any $m$ pairwise comparisons of the type $\alpha_i - \alpha_{i'}$, the $(1 - \alpha)$-level
Bonferroni intervals can be obtained as

$$(\bar{y}_{i..} - \bar{y}_{i'..}) \pm t[ab(n - 1), 1 - \alpha/2m]\sqrt{\frac{2MS_E}{bn}}.$$

Furthermore, since there are only $a - 1$ independent comparisons of this form,
the $(1 - \alpha)$-level Scheffé's confidence intervals are given by

$$(\bar{y}_{i..} - \bar{y}_{i'..}) \pm \left\{(a - 1)F[a - 1, ab(n - 1); 1 - \alpha]\,\frac{2MS_E}{bn}\right\}^{1/2}.$$

Similarly, when $H_0^B$ is rejected, similar simultaneous intervals for any
contrast on $\beta_{j(i)}$'s or any pairwise differences on them can also be obtained. Thus,
for any fixed $i$, the $(1 - \alpha)$-level Bonferroni simultaneous confidence intervals
for $\beta_{j(i)} - \beta_{j'(i)}$ are given by

$$(\bar{y}_{ij.} - \bar{y}_{ij'.}) \pm t[ab(n - 1), 1 - \alpha/2m]\sqrt{\frac{2MS_E}{n}}.$$

The $(1 - \alpha)$-level Scheffé's simultaneous confidence intervals are given by

$$(\bar{y}_{ij.} - \bar{y}_{ij'.}) \pm \left\{(b - 1)F[b - 1, ab(n - 1); 1 - \alpha]\,\frac{2MS_E}{n}\right\}^{1/2}.$$

For the case when B is random (i.e., we have a mixed effects or Model III
situation),

$$\widehat{\operatorname{Var}}(\hat{L}) = \frac{MS_{B(A)}}{bn}\sum_{i=1}^{a}\ell_i^2,$$

$$T = q[a, a(b - 1); 1 - \alpha],$$

and

$$S^2 = F[a - 1, a(b - 1); 1 - \alpha].$$

For making pairwise comparisons, Bonferroni intervals are given by

$$(\bar{y}_{i..} - \bar{y}_{i'..}) \pm t[a(b - 1), 1 - \alpha/2m]\sqrt{\frac{2MS_{B(A)}}{bn}},$$

which gives an overall level of at least $1 - \alpha$ where $m$ is the number of intervals
constructed.


6.10 UNEQUAL NUMBERS IN THE SUBCLASSES 


In experiments involving a nested classification, it is important to try to keep the 
sizes of nested factors (subsamples) equal. When the sizes of the subsamples 
are unequal, the analysis of variance and the expressions for the expected mean 
squares in the analysis of variance table become quite complicated. In this 
section, we indicate briefly an analysis of variance when there are unequal 
numbers in the subclasses.⁴

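Define the following notations:


$a$ = number of levels of factor A;
$b_i$ = number of levels of factor B within the $i$-th level of factor A;
$n_{ij}$ = number of replications from the $j$-th level of factor B within
the $i$-th level of factor A;
$n_{i.} = \sum_{j=1}^{b_i}n_{ij}$ = number of observations at the $i$-th level of factor A; and
$N = \sum_{i=1}^{a}n_{i.} = \sum_{i=1}^{a}\sum_{j=1}^{b_i}n_{ij}$ = total number of observations in the
experiment.


Now, the sums of squares can be defined by an analogy from the corresponding
balanced case. They are:

$$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{b_i}\sum_{k=1}^{n_{ij}}(y_{ijk} - \bar{y}_{...})^2 = \sum_{i=1}^{a}\sum_{j=1}^{b_i}\sum_{k=1}^{n_{ij}}y_{ijk}^2 - \frac{y_{...}^2}{N},$$

$$SS_A = \sum_{i=1}^{a}n_{i.}(\bar{y}_{i..} - \bar{y}_{...})^2 = \sum_{i=1}^{a}\frac{y_{i..}^2}{n_{i.}} - \frac{y_{...}^2}{N},$$

$$SS_{B(A)} = \sum_{i=1}^{a}\sum_{j=1}^{b_i}n_{ij}(\bar{y}_{ij.} - \bar{y}_{i..})^2 = \sum_{i=1}^{a}\sum_{j=1}^{b_i}\frac{y_{ij.}^2}{n_{ij}} - \sum_{i=1}^{a}\frac{y_{i..}^2}{n_{i.}},$$

and

$$SS_E = \sum_{i=1}^{a}\sum_{j=1}^{b_i}\sum_{k=1}^{n_{ij}}(y_{ijk} - \bar{y}_{ij.})^2 = \sum_{i=1}^{a}\sum_{j=1}^{b_i}\sum_{k=1}^{n_{ij}}y_{ijk}^2 - \sum_{i=1}^{a}\sum_{j=1}^{b_i}\frac{y_{ij.}^2}{n_{ij}},$$

where

$$y_{ij.} = \sum_{k=1}^{n_{ij}}y_{ijk}, \quad \bar{y}_{ij.} = y_{ij.}/n_{ij},$$

$$y_{i..} = \sum_{j=1}^{b_i}\sum_{k=1}^{n_{ij}}y_{ijk}, \quad \bar{y}_{i..} = y_{i..}/n_{i.},$$

and

$$y_{...} = \sum_{i=1}^{a}\sum_{j=1}^{b_i}\sum_{k=1}^{n_{ij}}y_{ijk}, \quad \bar{y}_{...} = y_{...}/N.$$


⁴ For the design with unequal numbers in the subclasses there is no unique analysis of variance.
The conventional analysis of variance being presented here is based on quadratics commonly
known as Type I sums of squares.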
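
For unequal subclass numbers the same sums of squares can be computed from totals, with each total divided by its own number of observations. The Python sketch below is illustrative only; the function name `unbalanced_nested_ss` and the data are hypothetical.

```python
def unbalanced_nested_ss(y):
    """Sums of squares for a nested layout with unequal subclass numbers.

    y[i][j] is the list of observations in the j-th level of B within
    the i-th level of A; the lists may have different lengths.
    """
    n_total = sum(len(cell) for group in y for cell in group)
    grand_total = sum(v for group in y for cell in group for v in cell)
    sum_sq = sum(v * v for group in y for cell in group for v in cell)
    correction = grand_total ** 2 / n_total  # y...^2 / N

    # sum over i of y_i..^2 / n_i.
    a_term = sum(
        sum(v for cell in group for v in cell) ** 2
        / sum(len(cell) for cell in group)
        for group in y
    )
    # sum over i, j of y_ij.^2 / n_ij
    cell_term = sum(sum(cell) ** 2 / len(cell) for group in y for cell in group)

    return {
        "T": sum_sq - correction,
        "A": a_term - correction,
        "B(A)": cell_term - a_term,
        "E": sum_sq - cell_term,
    }


# Hypothetical data: a = 2, b_1 = b_2 = 2, with n_ij = 2, 1, 3, 2
y = [[[1, 3], [5]],
     [[2, 4, 6], [10, 12]]]
ss = unbalanced_nested_ss(y)
print(ss)
```

As in the balanced case, the three components add up to the total sum of squares.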


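The derivations for expected mean squares can be found in Scheffé (1959, pp.
255-258) and Graybill (1961, pp. 354-357). The resultant analysis of variance
table is shown in Table 6.4. The coefficients of variance components in the
expected mean square column are determined as follows:

$$n_1 = \frac{N - \sum_{i=1}^{a}\dfrac{\sum_{j=1}^{b_i}n_{ij}^2}{n_{i.}}}{\sum_{i=1}^{a}b_i - a}, \qquad n_2 = \frac{\sum_{i=1}^{a}\dfrac{\sum_{j=1}^{b_i}n_{ij}^2}{n_{i.}} - \dfrac{\sum_{i=1}^{a}\sum_{j=1}^{b_i}n_{ij}^2}{N}}{a - 1},$$

and

$$n_3 = \frac{N - \dfrac{\sum_{i=1}^{a}n_{i.}^2}{N}}{a - 1}.$$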
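
The coefficients $n_1$, $n_2$, and $n_3$ depend only on the subclass sizes. The following Python sketch assumes the standard forms of these coefficients for the Type I analysis of a two-way nested classification; the function name `ems_coefficients` is hypothetical. A useful check is that in the balanced case they reduce to $n$, $n$, and $bn$, the coefficients appearing in Table 6.2.

```python
def ems_coefficients(sizes):
    """Coefficients n1, n2, n3 of the variance components.

    sizes[i][j] is n_ij, the number of observations in the j-th level
    of B within the i-th level of A.
    """
    a = len(sizes)
    total_b = sum(len(row) for row in sizes)  # sum of the b_i
    n_i = [sum(row) for row in sizes]         # n_i.
    big_n = sum(n_i)                          # N

    # sum over i of (sum over j of n_ij^2) / n_i.
    within = sum(sum(m * m for m in row) / n_i[i] for i, row in enumerate(sizes))
    # (sum over i, j of n_ij^2) / N
    overall = sum(m * m for row in sizes for m in row) / big_n

    n1 = (big_n - within) / (total_b - a)
    n2 = (within - overall) / (a - 1)
    n3 = (big_n - sum(t * t for t in n_i) / big_n) / (a - 1)
    return n1, n2, n3


# Balanced check: a = 3, b = 4, n = 3 should give n1 = n2 = 3 and n3 = 12
balanced = ems_coefficients([[3, 3, 3, 3]] * 3)
# Hypothetical unbalanced layout
unbalanced = ems_coefficients([[2, 1], [3, 2]])
print(balanced, unbalanced)
```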


TESTS OF HYPOTHESES 


Under Model I, the null hypothesis of interest, $H_0^B: \beta_{1(i)} = \beta_{2(i)} = \cdots = \beta_{b_i(i)} = 0$,
subject to the constraint that $\sum_{j=1}^{b_i}n_{ij}\beta_{j(i)} = 0$, $i = 1, 2, \ldots, a$,
can be tested by the statistic

$$F_B = \frac{MS_{B(A)}}{MS_E},$$

which, when $H_0^B$ is true, has an $F$ distribution with $\sum_{i=1}^{a}b_i - a$ and $N - \sum_{i=1}^{a}b_i$
degrees of freedom. Similarly, one may be interested in testing whether the
effects at each level of the factor A are the same, i.e., $H_0^A: \alpha_1 = \alpha_2 = \cdots = \alpha_a$.
The hypothesis $H_0^A$, however, cannot be tested. It can be shown that the statistic

$$F'_A = \frac{MS_A}{MS_E},$$

where $F'_A$ has an $F$ distribution with $a - 1$ and $N - \sum_{i=1}^{a}b_i$ degrees of freedom,
tests the hypothesis $H_0: \alpha_i = 0$, $i = 1, 2, \ldots, a$, subject to the constraints that
$\sum_{i=1}^{a}n_{i.}\alpha_i = 0$ and $\sum_{j=1}^{b_i}n_{ij}\beta_{j(i)} = 0$, $i = 1, 2, \ldots, a$. For some further
discussion and derivation of the results, see Searle (1987, Chapter III).


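TABLE 6.4
Analysis of Variance for Model (6.1.1) Involving Unequal Numbers in the Subclasses

Source of      Degrees of                    Sum of        Mean
Variation      Freedom                       Squares       Square
Due to A       $a - 1$                       $SS_A$        $MS_A$
B within A     $\sum_{i=1}^{a}b_i - a$       $SS_{B(A)}$   $MS_{B(A)}$
Error          $N - \sum_{i=1}^{a}b_i$       $SS_E$        $MS_E$
Total          $N - 1$                       $SS_T$

Expected Mean Square

Model I* (A Fixed, B Fixed)
Due to A       $\sigma_e^2 + \frac{1}{a-1}\sum_{i=1}^{a}n_{i.}\alpha_i^2$
B within A     $\sigma_e^2 + \frac{1}{\sum_{i=1}^{a}b_i - a}\sum_{i=1}^{a}\sum_{j=1}^{b_i}n_{ij}\beta_{j(i)}^2$
Error          $\sigma_e^2$

Model II (A Random, B Random)
Due to A       $\sigma_e^2 + n_2\sigma_\beta^2 + n_3\sigma_\alpha^2$
B within A     $\sigma_e^2 + n_1\sigma_\beta^2$
Error          $\sigma_e^2$

Model III** (A Fixed, B Random)
Due to A       $\sigma_e^2 + n_2\sigma_\beta^2 + \frac{1}{a-1}\sum_{i=1}^{a}n_{i.}\alpha_i^2$
B within A     $\sigma_e^2 + n_1\sigma_\beta^2$
Error          $\sigma_e^2$

* The side constraints under Model I are $\sum_{i=1}^{a}n_{i.}\alpha_i = 0$ and $\sum_{j=1}^{b_i}n_{ij}\beta_{j(i)} = 0$ $(i = 1, 2, \ldots, a)$.

** The side constraint under Model III is $\sum_{i=1}^{a}n_{i.}\alpha_i = 0$.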


Under Models II and III, it is evident that there are no simple F tests for the
hypotheses relating factor A effects. This is because under the null hypothesis
we do not have two mean squares in the analysis of variance table that estimate
the same quantity. All three mean squares happen to estimate different
quantities. In addition, although $MS_E$ is distributed as constant times a chi-square
random variable, $MS_{B(A)}$ and $MS_A$ do not in general have a scaled chi-square
distribution. Furthermore, $MS_{B(A)}$ and $MS_A$ are not statistically independent.
Approximate F tests can be developed using the Satterthwaite procedure
discussed in Appendix K.⁵ However, a test of the hypothesis:

$$H_0^B: \sigma_\beta^2 = 0 \quad \text{versus} \quad H_1^B: \sigma_\beta^2 > 0$$

can be carried out by the ratio

$$MS_{B(A)}/MS_E,$$

which has an $F$ distribution with $\sum_{i=1}^{a}b_i - a$ and $N - \sum_{i=1}^{a}b_i$ degrees of
freedom.


POINT AND INTERVAL ESTIMATION 


Under Model I, when H,’ is rejected, it is often of interest to construct a 
simultaneous confidence interval for Bj) — Bj), i = 1, 2,..., a. Fora fixed, 
the 1 —@ level Bonferroni simultaneous confidence intervals for m independent 
comparisons of the form 6 j:) — Bj) are given by 


(Vij. — Yi Dat) N — yo bi. 1—a/2m 


I=] 


Furthermore, for any given i, since there are only b; — 1 independent compar- 
isons of this form, the 1 — @ level Scheffé’s simultaneous confidence intervals 
are given by 


aor é 1 1 : 
ij. Yi J) (Oi — DF 1b; —1,N — > obi 1 —a}{—+—]MSe;e . 
= Nij Nj j' 


⁵ Some authors have ignored the unbalanced structure of the design and have used the conventional F test based on the statistic $MS_A/MS_{B(A)}$ with $a - 1$ and $\sum_{i=1}^{a} b_i - a$ degrees of freedom (see, e.g., Bliss (1976, p. 353)). A common procedure is to ignore the assumptions of independence and chi-squaredness and construct an approximate F test using synthesis of mean squares based on the Satterthwaite procedure. For some further discussions and results on tests of hypotheses concerning variance components involving unequal sample sizes, see Cummings and Gaylor (1974), Tietjen (1974), Hussein and Milliken (1978b), Tan and Cheng (1984), Khuri (1987), and Hernandez et al. (1992).


Under Models II and III, if expected mean squares are equated to the corresponding mean squares in Table 6.4 and the resulting equations are solved for the variance components, these are the so-called analysis of variance estimates and are unbiased (Searle (1961)). For example, under Model II, the estimates of the variance components are⁶

$$\hat{\sigma}_e^2 = MS_E,$$

$$\hat{\sigma}_\beta^2 = \frac{1}{n_1}(MS_{B(A)} - MS_E),$$

and

$$\hat{\sigma}_\alpha^2 = \frac{1}{n_3}\left\{(MS_A - MS_{B(A)}) - \frac{n_2 - n_1}{n_1}(MS_{B(A)} - MS_E)\right\}.$$

The expression $(n_2 - n_1)/n_1$ is usually negligible, in which case the expression for $\hat{\sigma}_\alpha^2$ reduces to $(MS_A - MS_{B(A)})/n_3$. The quantities $n_1$ and $n_2$ can be thought of as kinds of averages of the numbers of observations in the subgroups ($n_{ij}$) and they both reduce to $n$ when $b_i = b$ and $n_{ij} = n$. Similarly, $n_3$ can be thought of as an average of the numbers of observations corresponding to the levels of factor A and it reduces to $bn$ when $b_i = b$ and $n_{ij} = n$.

Under Model II, in terms of the results on confidence intervals for the variance components, an exact confidence interval for $\sigma_e^2$ can be obtained by noting that the statistic $(N - \sum_{i=1}^{a} b_i)MS_E/\sigma_e^2$ has a chi-square distribution with $N - \sum_{i=1}^{a} b_i$ degrees of freedom. However, exact confidence intervals for $\sigma_\alpha^2$ and $\sigma_\beta^2$ do not exist. A conservative $1 - \alpha$ level confidence interval for $\sigma_\beta^2$ can be obtained as


Remark: Approximate confidence intervals for $\sigma_\alpha^2$ and $\sigma_\beta^2$ have been proposed by Hernandez et al. (1992). Similarly, the problem of constructing exact confidence intervals on the ratios of variance components $\sigma_\alpha^2/\sigma_e^2$ and $\sigma_\beta^2/\sigma_e^2$ has been discussed by Seely and El-Bassiouni (1983) and Verdooren (1988). In addition, Burdick and Graybill (1985) and Hernandez and Burdick (1993) have considered confidence intervals for the total variance $\sigma_e^2 + \sigma_\beta^2 + \sigma_\alpha^2$, and Burdick et al. (1986a) and Sen et al. (1992) have developed confidence intervals on the proportions of variability $\sigma_e^2/(\sigma_e^2 + \sigma_\beta^2 + \sigma_\alpha^2)$, $\sigma_\beta^2/(\sigma_e^2 + \sigma_\beta^2 + \sigma_\alpha^2)$, and $\sigma_\alpha^2/(\sigma_e^2 + \sigma_\beta^2 + \sigma_\alpha^2)$. For a concise discussion of these and other results on confidence intervals for variance components including numerical examples, see Burdick and Graybill (1992, pp. 98-109).


⁶ The analysis of variance estimators for the unbalanced classification do not lead to the same estimates as the maximum likelihood estimators. The maximum likelihood equations are difficult to solve in unbalanced classifications. For some results on other estimation procedures, see Searle (1971b, pp. 475-477).


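Under Model III, we are also in a difficult situation when we want to determine confidence intervals for the general mean $\mu$, factor A level means $\mu + \alpha_i$, fixed effects $\alpha_i$, pairwise differences $\alpha_i - \alpha_{i'}$, or the contrast $\sum_{i=1}^{a} \ell_i\alpha_i$ $(\sum_{i=1}^{a} \ell_i = 0)$. For example, under Model III, the variance of a factor A level mean is

$$Var(\bar{y}_{i..}) = \frac{n_{i.}\sigma_e^2 + \sum_{j=1}^{b_i} n_{ij}^2\sigma_\beta^2}{n_{i.}^2},$$

and the variance of the overall mean $\bar{y}_{...}$ is

$$Var(\bar{y}_{...}) = \frac{N\sigma_e^2 + \sum_{i=1}^{a}\sum_{j=1}^{b_i} n_{ij}^2\sigma_\beta^2}{N^2}.$$

Furthermore,

$$Var(\bar{y}_{i..} - \bar{y}_{i'..}) = \frac{n_{i.}\sigma_e^2 + \sum_{j=1}^{b_i} n_{ij}^2\sigma_\beta^2}{n_{i.}^2} + \frac{n_{i'.}\sigma_e^2 + \sum_{j=1}^{b_{i'}} n_{i'j}^2\sigma_\beta^2}{n_{i'.}^2}$$

and

$$Var(\bar{y}_{i..} - \bar{y}_{...}) = \frac{N - n_{i.}}{N n_{i.}}\sigma_e^2 + \left\{\frac{N - 2n_{i.}}{N n_{i.}^2}\sum_{j=1}^{b_i} n_{ij}^2 + \frac{1}{N^2}\sum_{i=1}^{a}\sum_{j=1}^{b_i} n_{ij}^2\right\}\sigma_\beta^2.$$

Comparing these variances with the expected mean square expressions in Table 6.4, we notice that there are no simple estimates of the variances. Approximate methods involving the Satterthwaite procedure can, however, be used to obtain the required confidence intervals. Burdick and Graybill (1992, pp. 170-171) give a numerical example illustrating methods of constructing confidence intervals for $\sigma_\beta^2$, $\sigma_e^2$, and $\sigma_\beta^2/\sigma_e^2$.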


Remark: In designing an experiment involving subsamples, it is suggested that subsamples of equal size preferably should be used. If during the course of the study, certain data are missing and the numbers in the various subsamples are unequal, then after some consideration of why the data are missing, the results can be analyzed by using the means of the subsamples. Such an analysis violates the assumption of equal variance, but the variances will be approximately equal. A drawback in analyzing the cell means is that we no longer have the unbiased estimates of the variance components. Since F tests are relatively robust against heterogeneous variances, small differences in $n_{ij}$ would not affect the conclusions. An advantage of the cell means procedure is that the values tend to be normally distributed in view of the central limit theorem. If, however, the assumption of normality is not in question and we wish to estimate variance components, an alternative procedure is as follows. Missing values are replaced by the corresponding cell means and an analysis is carried out assuming an equal number of observations. The degrees of freedom corresponding to the residual or error mean square are decreased by one for each missing value. The consideration of why the values are missing should always be taken into account. It is appropriate to analyze the remaining data only if the loss of the missing data can be ascribed to chance. The reader is referred to Yates (1934) for further discussion on this point.


TABLE 6.5
Weight Gains of Chickens Placed on Four Feeding Treatments

Treatments                    LoCaLoL         LoCaHiL         HiCaLoL         HiCaHiL
Pens                         1       2       1       2       1       2       1       2
Weight gains               573   1,041     618     943     731     416     518     416
                           613     636     659     734     770     776     672     776
                           901     685     817   1,050     787     657     576     657

Pen totals ($y_{ij.}$)    4,156   4,564   4,414   4,647   4,728   3,720   4,241   3,720
Treatment totals ($y_{i..}$)  8,720           9,061           8,448           7,961

Source: Damon and Harvey (1987, p. 26). Used with permission.


6.11 WORKED EXAMPLE FOR MODEL I


Damon and Harvey (1987, p. 26) reported data on weight gains of chickens 
placed on four feeding treatments. The original data were supplied by Dr. 
Donald L. Anderson of the Department of Veterinary and Animal Sciences at 
the University of Massachusetts. The experiment involved the determination 
of weight gains (in grams) from 10 to 20 weeks of chickens placed on four 
feeding treatments obtained from combinations of high and low calcium and 
lysine. Weight determinations were made using six chickens in two pens from 
each of the four feeding treatments. The data are given in Table 6.5. 

The data in Table 6.5 can be regarded as forming a two-way nested classification with pens nested within treatments and weight determinations made using six chickens from each pen. Here, treatments are fixed, and although pens are usually selected randomly, we analyze the data under the assumptions of a fixed effects model where both treatments and pens are considered to have systematic effects. The mathematical model would be

$$y_{ijk} = \mu + \alpha_i + \beta_{j(i)} + e_{k(ij)} \quad \begin{cases} i = 1, 2, 3, 4 \\ j = 1, 2 \\ k = 1, 2, \ldots, 6, \end{cases}$$

where $y_{ijk}$ is the k-th observation in the j-th pen and the i-th treatment, $\mu$ is the general mean, $\alpha_i$ is the effect of the i-th treatment, $\beta_{j(i)}$ is the effect of the j-th pen nested within the i-th treatment, and $e_{k(ij)}$ is the customary error term associated with the k-th observation, nested within the j-th pen within the i-th treatment. Also, the $\alpha_i$'s and $\beta_{j(i)}$'s are assumed to be fixed effects with $\sum_{i=1}^{4} \alpha_i = 0$, $\sum_{j=1}^{2} \beta_{j(i)} = 0$, $i = 1, 2, 3, 4$, and the $e_{k(ij)}$'s are assumed to be independently and normally distributed with mean zero and variance $\sigma_e^2$.


Using the computational formulae for the sums of squares given in Section 6.6, we have

$$SS_T = (573)^2 + (636)^2 + \cdots + (657)^2 - \frac{(34,190)^2}{4 \times 2 \times 6} = 25,515,538 - 24,353,252.083 = 1,162,285.917,$$

$$SS_A = \frac{(8,720)^2 + (9,061)^2 + (8,448)^2 + (7,961)^2}{2 \times 6} - \frac{(34,190)^2}{4 \times 2 \times 6} = 24,407,195.500 - 24,353,252.083 = 53,943.417,$$

$$SS_{B(A)} = \frac{(4,156)^2 + (4,564)^2 + \cdots + (3,720)^2}{6} - \frac{(8,720)^2 + (9,061)^2 + (8,448)^2 + (7,961)^2}{2 \times 6} = 24,532,883.667 - 24,407,195.500 = 125,688.167,$$

and

$$SS_E = (573)^2 + (636)^2 + \cdots + (657)^2 - \frac{(4,156)^2 + (4,564)^2 + \cdots + (3,720)^2}{6} = 25,515,538 - 24,532,883.667 = 982,654.333.$$


These results along with the remaining calculations are summarized in Table 6.6. 


TABLE 6.6
Analysis of Variance for the Weight Gains Data of Table 6.5

Source of              Degrees of   Sum of          Mean         Expected
Variation              Freedom      Squares         Square       Mean Square                                                                   F Value   p-Value

Treatments             3            53,943.417      17,981.139   $\sigma_e^2 + \frac{2 \times 6}{4-1}\sum_{i=1}^{4}\alpha_i^2$                  0.732     0.539

Pens                   4            125,688.167     31,422.042   $\sigma_e^2 + \frac{6}{4(2-1)}\sum_{i=1}^{4}\sum_{j=1}^{2}\beta_{j(i)}^2$      1.279     0.294
(within treatments)

Error                  40           982,654.333     24,566.358   $\sigma_e^2$

Total                  47           1,162,285.917


The test of the hypothesis $H_0^{B(A)}$: all $\beta_{j(i)} = 0$ versus $H_1^{B(A)}$: $\beta_{j(i)} \neq 0$ for at least one pen gives the variance ratio of 1.279, which is not significant (p = 0.294). Similarly, the test of the hypothesis $H_0^A$: all $\alpha_i = 0$ versus $H_1^A$: $\alpha_i \neq 0$ for at least one treatment gives the variance ratio of 0.732, which is also not significant (p = 0.539). Thus, there do not seem to be any significant differences in either treatment or pen effects.
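
The arithmetic of this example can be retraced from the pen totals alone. The following is a minimal Python sketch, not part of the original analysis; it takes the pen totals and the raw sum of squares 25,515,538 as quoted in this section, and the variable names are illustrative only.

```python
# Balanced two-way nested ANOVA (fixed effects) rebuilt from the quoted totals.
a, b, n = 4, 2, 6                       # treatments, pens per treatment, chickens per pen
pen_totals = [[4156, 4564], [4414, 4647], [4728, 3720], [4241, 3720]]
raw_ss = 25_515_538                     # sum of squared observations, as quoted in the text

trt_totals = [sum(pens) for pens in pen_totals]
correction = sum(trt_totals) ** 2 / (a * b * n)
trt_term = sum(t ** 2 for t in trt_totals) / (b * n)
pen_term = sum(p ** 2 for pens in pen_totals for p in pens) / n

ss_t = raw_ss - correction              # total
ss_a = trt_term - correction            # treatments
ss_ba = pen_term - trt_term             # pens within treatments
ss_e = raw_ss - pen_term                # error

ms_a = ss_a / (a - 1)
ms_ba = ss_ba / (a * (b - 1))
ms_e = ss_e / (a * b * (n - 1))

# Under Model I both effects are tested against the error mean square.
f_treatments = ms_a / ms_e
f_pens = ms_ba / ms_e
print(round(ss_a, 3), round(ss_ba, 3), round(ss_e, 3))
print(round(f_treatments, 3), round(f_pens, 3))
```

The p-values quoted in the text would additionally require the F distribution function, which the standard library does not provide.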


6.12 WORKED EXAMPLE FOR MODEL II 


Box et al. (1978, pp. 574-575) reported data from an experiment designed to 
estimate moisture content of the pigment paste. For this purpose, 15 batches 
of pigment paste used in the manufacture were randomly selected, each batch 
was independently sampled twice, and for each sample, two analyses were 
performed. The data are given in Table 6.7. 

The model for this experiment is a two-way nested classification with samples nested within batches and two analyses (subsamples) made from each sample. We will analyze the data under the assumptions of a random effects model since both batches and samples are considered to have variable effects. The mathematical model would be

$$y_{ijk} = \mu + \alpha_i + \beta_{j(i)} + e_{k(ij)} \quad \begin{cases} i = 1, 2, \ldots, 15 \\ j = 1, 2 \\ k = 1, 2, \end{cases}$$

where $y_{ijk}$ is the k-th observation (analysis) in the j-th sample in the i-th batch, $\mu$ is the general mean, $\alpha_i$ is the effect of the i-th batch, $\beta_{j(i)}$ is the effect of the j-th sample nested within the i-th batch, and $e_{k(ij)}$ is the effect of the k-th subsample nested within the j-th sample within the i-th batch (error term). Furthermore, the $\alpha_i$'s, $\beta_{j(i)}$'s, and $e_{k(ij)}$'s are assumed to be independently and normally distributed with mean zero and variances $\sigma_\alpha^2$, $\sigma_\beta^2$, and $\sigma_e^2$, respectively.

Using the computational formulae for the sums of squares given in Section 6.7, we have

$$SS_T = (40)^2 + (39)^2 + \cdots + (28)^2 - \frac{(1,607)^2}{15 \times 2 \times 2} = 45,149 - 43,040.817 = 2,108.183,$$

$$SS_A = \frac{(139)^2 + (105)^2 + \cdots + (130)^2}{2 \times 2} - \frac{(1,607)^2}{15 \times 2 \times 2} = 44,251.750 - 43,040.817 = 1,210.933,$$

$$SS_{B(A)} = \frac{(79)^2 + (60)^2 + \cdots + (54)^2}{2} - \frac{(139)^2 + (105)^2 + \cdots + (130)^2}{2 \times 2} = 45,121.500 - 44,251.750 = 869.750,$$

and

$$SS_E = (40)^2 + (39)^2 + \cdots + (28)^2 - \frac{(79)^2 + (60)^2 + \cdots + (54)^2}{2} = 45,149 - 45,121.500 = 27.500.$$


TABLE 6.7
Moisture Content from Two Analyses on Two Samples of 15 Batches of Pigment Paste

                                 Analysis totals   Sample totals
Batch   Sample   Analyses        ($y_{ij.}$)       ($y_{i..}$)
1       1        40   39         79                139
        2        30   30         60
2       1        26   28         54                105
        2        25   26         51
3       1        29   28         57                86
        2        14   15         29
4       1        30   31         61                109
        2        24   24         48
5       1        19   20         39
        2
6       1        33   32         65                115
        2        26   24         50
7       1        23   24         47                112
        2        32   33         65
8       1        34   34         68                126
        2        29   29         58
9       1        27   27         54                116
        2        31   31         62
10      1
        2
11      1        25   23         48                100
        2        25   27         52
12      1        29   29         58                121
        2        31   32         63
13      1        19   20         39                98
        2        29   30         59
14      1        23   24         47                97
        2        25   25         50
15      1        39   37         76                130
        2        26   28         54

Source: Box et al. (1978, pp. 574-575). Used with permission.


TABLE 6.8
Analysis of Variance for the Moisture Content of Pigment Paste Data of Table 6.7

Source of          Degrees of   Sum of      Mean      Expected
Variation          Freedom      Squares     Square    Mean Square                                       F Value   p-Value

Batches            14           1,210.933   86.495    $\sigma_e^2 + 2\sigma_\beta^2 + 4\sigma_\alpha^2$   1.492     0.226

Samples            15           869.750     57.983    $\sigma_e^2 + 2\sigma_\beta^2$                     63.231    <0.001
(within batches)

Error              30           27.500      0.917     $\sigma_e^2$

Total              59           2,108.183


These results along with the remaining calculations are summarized in Table 6.8. The test of the hypothesis $H_0^{B(A)}: \sigma_\beta^2 = 0$ versus $H_1^{B(A)}: \sigma_\beta^2 > 0$ gives the variance ratio of 63.231, which is highly significant (p < 0.001). However, the test of the hypothesis $H_0^A: \sigma_\alpha^2 = 0$ versus $H_1^A: \sigma_\alpha^2 > 0$ gives the F ratio of 1.492, which is not significant (p = 0.226). Thus, we reject the first null hypothesis but not the latter. The point estimates of the variance components are obtained as

$$\hat{\sigma}_e^2 = 0.917,$$

$$\hat{\sigma}_\beta^2 = \frac{1}{2}(57.983 - 0.917) = 28.533,$$

and

$$\hat{\sigma}_\alpha^2 = \frac{1}{4}(86.495 - 57.983) = 7.128.$$

These variance components account for 2.5, 78.0, and 19.5 percent of the total variation in the experimental data. The findings suggest that perhaps the largest single source of variability is the error arising in chemical sampling from the batches. The batch-to-batch variability also seems to be quite large although the results are not statistically significant. In order to estimate the mean of a batch, the estimated variance based on one subsample from one sample would be

$$\hat{\sigma}_e^2 + \hat{\sigma}_\beta^2 = 0.917 + 28.533 = 29.450.$$

The estimated variance based on two subsamples from one sample would be $\hat{\sigma}_e^2/2 + \hat{\sigma}_\beta^2 = 0.917/2 + 28.533 = 28.992$. Thus, there is very little gain in the precision of the estimation of a batch mean by using two subsamples (analyses) per sample rather than one. The use of two analyses may, however, be useful as a check against any major errors.
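
The estimates and the precision comparison above can be reproduced with a few lines of Python. This is a minimal sketch using the mean squares of Table 6.8; the helper `batch_mean_variance` is a hypothetical name introduced here, and it assumes the usual expression $\sigma_\beta^2/s + \sigma_e^2/(sr)$ for the variance of a batch mean based on $s$ samples with $r$ analyses each.

```python
# ANOVA estimates of the variance components for the balanced random model.
ms_a, ms_ba, ms_e = 86.495, 57.983, 0.917   # mean squares from Table 6.8
b, n = 2, 2                                 # samples per batch, analyses per sample

var_e = ms_e
var_beta = (ms_ba - ms_e) / n
var_alpha = (ms_a - ms_ba) / (b * n)

total = var_e + var_beta + var_alpha
percent = [100 * v / total for v in (var_e, var_beta, var_alpha)]

def batch_mean_variance(samples, analyses):
    """Estimated variance of a batch mean from `samples` samples,
    each analyzed `analyses` times (hypothetical helper)."""
    return var_beta / samples + var_e / (samples * analyses)

print(round(var_beta, 3), round(var_alpha, 3))
print([round(p, 1) for p in percent])
print(round(batch_mean_variance(1, 1), 3))   # one sample, one analysis
print(round(batch_mean_variance(1, 2), 3))   # one sample, two analyses
print(round(batch_mean_variance(2, 1), 3))   # two samples, one analysis each
```

Under these estimates, a second analysis of the same sample barely changes the variance, whereas a second sample roughly halves it, because the sampling component dominates.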


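Finally, we can obtain confidence limits for the overall mean $\mu$ of the process as follows. We have

$$\hat{\mu} = \bar{y}_{...} = 26.783,$$

$$Var(\bar{y}_{...}) = \frac{\sigma_e^2 + n\sigma_\beta^2 + bn\sigma_\alpha^2}{abn},$$

and

$$\widehat{Var}(\bar{y}_{...}) = \frac{MS_A}{abn} = \frac{86.495}{15 \times 2 \times 2} = 1.442.$$

Now, since

$$\frac{\bar{y}_{...} - \mu}{\sqrt{\widehat{Var}(\bar{y}_{...})}} \sim t[a - 1] \quad \text{and} \quad t[14, 0.975] = 2.145,$$

the 95 percent confidence limits for $\mu$ are obtained as

$$26.783 \pm 2.145\sqrt{1.442} = (24.207, 29.359).$$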
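
As a quick check of these limits, the sketch below recomputes them in Python. It assumes only the quantities quoted above (grand total 1,607 over 60 analyses, $MS_A = 86.495$, and the tabulated value $t[14, 0.975] = 2.145$); the variable names are illustrative.

```python
import math

a, b, n = 15, 2, 2          # batches, samples per batch, analyses per sample
grand_total = 1607          # sum of all 60 moisture readings
ms_a = 86.495               # batches mean square from Table 6.8
t_crit = 2.145              # t[14, 0.975], taken from the text

y_bar = grand_total / (a * b * n)
std_err = math.sqrt(ms_a / (a * b * n))    # estimated standard error of the grand mean
lower = y_bar - t_crit * std_err
upper = y_bar + t_crit * std_err
print(round(y_bar, 3), round(std_err, 3), round(lower, 3), round(upper, 3))
```

The limits agree with those in the text up to rounding of the intermediate values.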


6.13 WORKED EXAMPLE FOR MODEL II: UNEQUAL NUMBERS IN THE SUBCLASSES


To give an example of Model II involving unequal sample sizes, consider the data in Table 6.9 taken from Graybill (1961, pp. 357-358). The data are artificial but the experiment is supposed to represent a breeding experiment where factor A designates sires ($a = 4$) and factor B designates dams ($b_1 = 3$, $b_2 = 4$, $b_3 = 2$, $b_4 = 3$) nested within sires. There are a total of $N = 52$ observations. The data in Table 6.9 can be regarded as forming a two-way nested classification with unequal sample sizes. Here, dams are nested within sires and sample determinations are made from each dam. Since both dams and sires are randomly selected, the data should be analyzed using a random effects model. The mathematical model would be

$$y_{ijk} = \mu + \alpha_i + \beta_{j(i)} + e_{k(ij)} \quad \begin{cases} i = 1, 2, 3, 4 \\ j = 1, 2, \ldots, b_i \\ k = 1, 2, \ldots, n_{ij}, \end{cases}$$

where $y_{ijk}$ is the k-th observation for the j-th dam in the i-th sire, $\mu$ is the general mean, $\alpha_i$ is the effect of the i-th sire, $\beta_{j(i)}$ is the effect of the j-th dam nested within the i-th sire, and $e_{k(ij)}$ is the effect of the k-th observation nested within the j-th dam within the i-th sire (error term). Furthermore, the $\alpha_i$'s, $\beta_{j(i)}$'s, and $e_{k(ij)}$'s are assumed to be independently and normally distributed with mean zero and variances $\sigma_\alpha^2$, $\sigma_\beta^2$, and $\sigma_e^2$, respectively.
TABLE 6.9
Data on Weight Gains from a Breeding Experiment

Sires                              1                    2                  3            4
Dams                          1    2    3      1    2    3    4      1    2      1    2    3
Weight gains                 32   30   34     26   22   23   21     16   14     31   42   26
                             31   26   30     20   31   21   21     20   18     34   43   25
                             23   29   26     18   20   24   30     32   16     41   40   29
                             26   28   34          21   26               17     40   35   40
                                  18   32               18                           29   37
                                       31
                                       26

Dam totals ($y_{ij.}$)      112  131  213     64   94  112   72     68   65    146  189  157
No. in j-th dam ($n_{ij}$)    4    5    7      3    4    5    3      3    4      4    5    5
(within i-th sire)

Sire totals ($y_{i..}$)          456                  342               133          492
No. in i-th sire ($n_{i.}$)       16                   15                 7           14

Source: Graybill (1961, p. 358). Used with permission.


All the quantities needed for the analysis of variance computations outlined in Section 6.10 can be readily computed on an electronic calculator. The results are:

$$\frac{y_{...}^2}{N} = \frac{(1,423)^2}{52} = 38,940.942,$$

$$\sum_{i=1}^{a}\sum_{j=1}^{b_i} \frac{y_{ij.}^2}{n_{ij}} = 40,861.201,$$

and

$$\sum_{i=1}^{a} \frac{y_{i..}^2}{n_{i.}} = 40,610.885.$$

Thus,

$$SS_T = 41,811 - 38,940.942 = 2,870.058,$$
$$SS_A = 40,610.885 - 38,940.942 = 1,669.943,$$
$$SS_{B(A)} = 40,861.201 - 40,610.885 = 250.316,$$

and

$$SS_E = 41,811 - 40,861.201 = 949.799.$$

The corresponding degrees of freedom are determined as

Total: $N - 1 = 52 - 1 = 51$,
Sires: $a - 1 = 4 - 1 = 3$,
Dams (within sires): $\sum_{i=1}^{a} b_i - a = 12 - 4 = 8$,
Error: $N - \sum_{i=1}^{a} b_i = 52 - 12 = 40$.
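
For readers who prefer to verify these sums of squares by machine, the following minimal Python sketch recomputes them from the observations of Table 6.9. The nested-dictionary layout and the names used are choices made here for illustration, not part of the original analysis.

```python
# gains[sire] is a list of dams; each dam is a list of weight gains (Table 6.9).
gains = {
    1: [[32, 31, 23, 26], [30, 26, 29, 28, 18], [34, 30, 26, 34, 32, 31, 26]],
    2: [[26, 20, 18], [22, 31, 20, 21], [23, 21, 24, 26, 18], [21, 21, 30]],
    3: [[16, 20, 32], [14, 18, 16, 17]],
    4: [[31, 34, 41, 40], [42, 43, 40, 35, 29], [26, 25, 29, 40, 37]],
}

obs = [y for dams in gains.values() for dam in dams for y in dam]
N = len(obs)
a = len(gains)
n_dams = sum(len(dams) for dams in gains.values())

correction = sum(obs) ** 2 / N
raw = sum(y * y for y in obs)
dam_term = sum(sum(dam) ** 2 / len(dam) for dams in gains.values() for dam in dams)
sire_term = sum(
    sum(sum(dam) for dam in dams) ** 2 / sum(len(dam) for dam in dams)
    for dams in gains.values()
)

ss = {
    "total": raw - correction,
    "sires": sire_term - correction,
    "dams_within_sires": dam_term - sire_term,
    "error": raw - dam_term,
}
df = {"total": N - 1, "sires": a - 1,
      "dams_within_sires": n_dams - a, "error": N - n_dams}
ms = {k: ss[k] / df[k] for k in ("sires", "dams_within_sires", "error")}
print({k: round(v, 3) for k, v in ss.items()}, df)
```

Small differences in the third decimal from the hand calculation reflect rounding of the intermediate terms.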


For approximate tests of significance, we need to evaluate the coefficients of the variance components in the expected mean squares and then determine the linear combination of mean squares to be used as the denominator of an approximate F statistic using the Satterthwaite procedure. The basic quantities needed to determine the coefficients of the variance components are:

$$N = \sum_{i=1}^{a}\sum_{j=1}^{b_i} n_{ij} = 52,$$

$$\sum_{i=1}^{a} \frac{n_{i.}^2}{N} = \frac{(16)^2 + (15)^2 + (7)^2 + (14)^2}{52} = 13.9615,$$

$$\sum_{i=1}^{a}\sum_{j=1}^{b_i} \frac{n_{ij}^2}{N} = \frac{(4)^2 + (5)^2 + \cdots + (5)^2}{52} = 4.6158,$$

and

$$\sum_{i=1}^{a} \frac{\sum_{j=1}^{b_i} n_{ij}^2}{n_{i.}} = \frac{(4)^2 + (5)^2 + (7)^2}{16} + \cdots + \frac{(4)^2 + (5)^2 + (5)^2}{14} = 17.8441.$$

Now, the coefficients of the variance components in the expected mean square column are given by

$$n_1 = \frac{N - \sum_{i=1}^{a}\sum_{j=1}^{b_i} n_{ij}^2/n_{i.}}{\sum_{i=1}^{a}(b_i - 1)} = \frac{52 - 17.8441}{8} = 4.2695,$$

$$n_2 = \frac{\sum_{i=1}^{a}\sum_{j=1}^{b_i} n_{ij}^2/n_{i.} - \sum_{i=1}^{a}\sum_{j=1}^{b_i} n_{ij}^2/N}{a - 1} = \frac{17.8441 - 4.6158}{4 - 1} = 4.4094,$$

and

$$n_3 = \frac{N - \sum_{i=1}^{a} n_{i.}^2/N}{a - 1} = \frac{52 - 13.9615}{4 - 1} = 12.6795.$$


TABLE 6.10
Analysis of Variance for the Weight Gains Data of Table 6.9

Source of        Degrees of   Sum of      Mean      Expected
Variation        Freedom      Squares     Square    Mean Square
Sires            3            1,669.943   556.648   $\sigma_e^2 + 4.4094\sigma_\beta^2 + 12.6795\sigma_\alpha^2$
Dams             8            250.316     31.290    $\sigma_e^2 + 4.2695\sigma_\beta^2$
(within sires)
Error            40           949.799     23.745    $\sigma_e^2$
Total            51           2,870.058


The resulting sums of squares, mean squares, and expected mean squares are 
summarized in Table 6.10. 
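
The three coefficients depend only on the subclass numbers $n_{ij}$, so they are easy to recompute. The sketch below is a minimal Python illustration using the counts from Table 6.9; the list layout and names are assumptions made here. Computed in full precision, the second coefficient comes out near 4.4096, which agrees with the printed 4.4094 to about three decimals.

```python
# Subclass numbers n_ij: one inner list per sire, one entry per dam (Table 6.9).
n_ij = [[4, 5, 7], [3, 4, 5, 3], [3, 4], [4, 5, 5]]

a = len(n_ij)
n_i = [sum(row) for row in n_ij]                 # observations per sire
N = sum(n_i)
n_dams = sum(len(row) for row in n_ij)           # total number of dams

within = sum(sum(n * n for n in row) / ni for row, ni in zip(n_ij, n_i))
overall = sum(n * n for row in n_ij for n in row) / N
sires = sum(ni * ni for ni in n_i) / N

n1 = (N - within) / (n_dams - a)                 # coefficient of sigma_beta^2 in E(MS_B(A))
n2 = (within - overall) / (a - 1)                # coefficient of sigma_beta^2 in E(MS_A)
n3 = (N - sires) / (a - 1)                       # coefficient of sigma_alpha^2 in E(MS_A)
print(round(n1, 4), round(n2, 4), round(n3, 4))
```

With equal subclass numbers ($b_i = b$, $n_{ij} = n$) the same formulas give $n_1 = n_2 = n$ and $n_3 = bn$.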

The dams within sires can be tested directly against the error mean square, giving F = 31.290/23.745 = 1.318 (p = 0.262). The results are clearly nonsignificant. An approximate F test for sire effects can be carried out using the dams within sires mean square, giving F = 556.648/31.290 = 17.790 (p < 0.001), which is highly significant. However, to use the Satterthwaite procedure, we first compute the coefficients as

$$\ell_2 = n_2/n_1 = 4.4094/4.2695 = 1.0328, \quad \ell_1 = 1 - \ell_2 = -0.0328,$$

and the synthesized mean square is

$$-0.0328(23.745) + 1.0328(31.290) = 31.537.$$

The degrees of freedom for the synthesized mean square are

$$\nu' = \frac{(31.537)^2}{\dfrac{[-0.0328(23.745)]^2}{40} + \dfrac{[1.0328(31.290)]^2}{8}} \approx 7.6.$$

The F ratio based on the synthesized mean square is F = 556.648/31.537 = 17.651 (p < 0.001), which gives essentially the same result as the earlier approximate test.
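
The synthesis step can be sketched in Python as follows. This is a minimal illustration using the mean squares of Table 6.10 and the printed coefficients; the names are illustrative, and the approximate degrees of freedom are evaluated with the usual Satterthwaite formula.

```python
ms_a, ms_ba, ms_e = 556.648, 31.290, 23.745     # mean squares from Table 6.10
df_ba, df_e = 8, 40                             # their degrees of freedom
n1, n2 = 4.2695, 4.4094                         # coefficients of sigma_beta^2

# Linear combination l1*MS_E + l2*MS_B(A) whose expectation matches E(MS_A)
# under the hypothesis sigma_alpha^2 = 0.
l2 = n2 / n1
l1 = 1 - l2
ms_syn = l1 * ms_e + l2 * ms_ba

# Satterthwaite approximation to the degrees of freedom of the combination.
df_syn = ms_syn ** 2 / ((l1 * ms_e) ** 2 / df_e + (l2 * ms_ba) ** 2 / df_ba)

f_sires = ms_a / ms_syn
print(round(l2, 4), round(ms_syn, 3), round(df_syn, 2), round(f_sires, 3))
```

Because $\ell_1$ is negative here, the approximation should be read with some caution, although its contribution is negligible in this example.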

The estimates of the variance components $\sigma_\alpha^2$, $\sigma_\beta^2$, and $\sigma_e^2$ are obtained as the solution to the following simultaneous equations:

$$556.648 = \sigma_e^2 + 4.4094\sigma_\beta^2 + 12.6795\sigma_\alpha^2,$$
$$31.290 = \sigma_e^2 + 4.2695\sigma_\beta^2,$$

and

$$23.745 = \sigma_e^2.$$

Therefore, the desired estimates are given by

$$\hat{\sigma}_e^2 = 23.745,$$

$$\hat{\sigma}_\beta^2 = \frac{31.290 - 23.745}{4.2695} = 1.767,$$

and

$$\hat{\sigma}_\alpha^2 = \frac{556.648 - 23.745 - 4.4094(1.767)}{12.6795} = 41.414.$$

These variance components account for 35.5, 2.6, and 61.9 percent of the total variation in the experimental data. The results on variance components estimates are consistent with those on tests of hypothesis. It is further evident from this analysis that the larger part of the variability in weight gains is attributable to sires. The variability between repeated measurements on a given dam is also quite large.
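
Since the system is triangular, it can be solved by back-substitution. The short Python sketch below does this with the values of Table 6.10; the variable names are illustrative only.

```python
ms_a, ms_ba, ms_e = 556.648, 31.290, 23.745     # mean squares from Table 6.10
n1, n2, n3 = 4.2695, 4.4094, 12.6795            # expected mean square coefficients

# Back-substitution: error first, then dams within sires, then sires.
var_e = ms_e
var_beta = (ms_ba - var_e) / n1
var_alpha = (ms_a - var_e - n2 * var_beta) / n3

total = var_e + var_beta + var_alpha
percent = {name: 100 * v / total
           for name, v in (("error", var_e), ("dams", var_beta), ("sires", var_alpha))}
print(round(var_beta, 3), round(var_alpha, 3))
print({k: round(v, 1) for k, v in percent.items()})
```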


6.14 WORKED EXAMPLE FOR MODEL III 


Snedecor and Cochran (1989, p. 250) reported data from an experiment de- 
signed to evaluate the breeding value of a set of five sires in raising pigs. Each 
sire was mated to two dams randomly selected from a group of dams and aver- 
age daily weight gains of two pigs from each litter were recorded. The data are 
given in Table 6.11. 

The data in Table 6.11 can be regarded as a two-way nested classification with dams nested within sires and average daily gains made from two pigs of each litter. Here, sires are fixed and dams are random, so we have a Model III situation. The mathematical model for the experiment would be

$$y_{ijk} = \mu + \alpha_i + \beta_{j(i)} + e_{k(ij)} \quad \begin{cases} i = 1, 2, \ldots, 5 \\ j = 1, 2 \\ k = 1, 2, \end{cases} \qquad (6.14.1)$$

where $\mu$ is the general mean, $\alpha_i$ is the effect of the i-th sire, $\beta_{j(i)}$ is the effect of the j-th dam nested within the i-th sire, and $e_{k(ij)}$ is the effect of the k-th observation nested within the j-th dam within the i-th sire (error term). Furthermore, the $\alpha_i$'s are fixed with $\sum_{i=1}^{5} \alpha_i = 0$, and the $\beta_{j(i)}$'s and $e_{k(ij)}$'s are assumed to be independently and normally distributed with mean zero and variances $\sigma_\beta^2$ and $\sigma_e^2$, respectively.


TABLE 6.11
Average Daily Weight Gains of Two Pigs of Each Litter

Sires                          1              2              3              4              5
Dams                       1      2       1      2       1      2       1      2       1      2
Weight gains             2.77   2.58    2.28   3.01    2.36   2.72    2.87   2.31    2.74   2.50
                         2.38   2.94    2.22   2.61    2.71   2.74    2.46   2.24    2.56   2.48

Dam totals ($y_{ij.}$)   5.15   5.52    4.50   5.62    5.07   5.46    5.33   4.55    5.30   4.98
Sire totals ($y_{i..}$)     10.67          10.12          10.53           9.88          10.28

Source: Snedecor and Cochran (1989, p. 250). Used with permission.

To analyze the data of Table 6.11 according to model (6.1.1), the sums of squares using the computational formulae of Section 6.7 are obtained as

$$SS_T = (2.77)^2 + (2.38)^2 + \cdots + (2.48)^2 - \frac{(51.48)^2}{5 \times 2 \times 2} = 133.5598 - 132.5095 = 1.0503,$$

$$SS_A = \frac{(10.67)^2 + (10.12)^2 + \cdots + (10.28)^2}{2 \times 2} - \frac{(51.48)^2}{5 \times 2 \times 2} = 132.6092 - 132.5095 = 0.0997,$$

$$SS_{B(A)} = \frac{(5.15)^2 + (5.52)^2 + \cdots + (4.98)^2}{2} - \frac{(10.67)^2 + (10.12)^2 + \cdots + (10.28)^2}{2 \times 2} = 133.1728 - 132.6092 = 0.5636,$$

and

$$SS_E = (2.77)^2 + (2.38)^2 + \cdots + (2.48)^2 - \frac{(5.15)^2 + (5.52)^2 + \cdots + (4.98)^2}{2} = 133.5598 - 133.1728 = 0.3870.$$


TABLE 6.12
Analysis of Variance for the Weight Gains Data of Table 6.11

Source of        Degrees of   Sum of    Mean     Expected
Variation        Freedom      Squares   Square   Mean Square                                                           F Value   p-Value
Sires            4            0.0997    0.0249   $\sigma_e^2 + 2\sigma_\beta^2 + \frac{4}{4}\sum_{i=1}^{5}\alpha_i^2$   0.221     0.916
Dams             5            0.5636    0.1127   $\sigma_e^2 + 2\sigma_\beta^2$                                        2.912     0.071
(within sires)
Error            10           0.3870    0.0387   $\sigma_e^2$
Total            19           1.0503

These results together with the remaining calculations are shown in Table 6.12. The test of the hypothesis $H_0^B: \sigma_\beta^2 = 0$ versus $H_1^B: \sigma_\beta^2 > 0$ gives the variance ratio of 2.912, which is less than its 5 percent critical value of 3.33 (p = 0.071). Similarly, the test of the hypothesis $H_0^A$: all $\alpha_i = 0$ versus $H_1^A$: $\alpha_i \neq 0$ for at least one $i = 1, 2, \ldots, 5$ gives the variance ratio of 0.221, which again falls substantially below its critical value of 5.19 at the 5 percent level (p = 0.916). Thus, we may conclude that there is probably no significant effect of either sires or dams within sires on average daily weight gains in these data. The estimates of variance components $\sigma_e^2$ and $\sigma_\beta^2$ are given by

$$\hat{\sigma}_e^2 = 0.0387$$

and

$$\hat{\sigma}_\beta^2 = \frac{1}{2}(0.1127 - 0.0387) = 0.037.$$

The results on variance components estimates are consistent with those on tests of hypotheses given previously.
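
The whole mixed-model analysis can be retraced from the twenty observations of Table 6.11. The following Python sketch is a minimal illustration under the model of this section (sires fixed, dams random); the data layout and names are choices made here. Note that the sire effect is tested against the dams-within-sires mean square, while dams are tested against error.

```python
# gains[i][j] holds the two pig gains for dam j of sire i (Table 6.11).
gains = [
    [[2.77, 2.38], [2.58, 2.94]],
    [[2.28, 2.22], [3.01, 2.61]],
    [[2.36, 2.71], [2.72, 2.74]],
    [[2.87, 2.46], [2.31, 2.24]],
    [[2.74, 2.56], [2.50, 2.48]],
]
a, b, n = 5, 2, 2

obs = [y for sire in gains for dam in sire for y in dam]
correction = sum(obs) ** 2 / (a * b * n)
raw = sum(y * y for y in obs)
dam_term = sum(sum(dam) ** 2 for sire in gains for dam in sire) / n
sire_term = sum(sum(sum(dam) for dam in sire) ** 2 for sire in gains) / (b * n)

ss_t = raw - correction
ss_a = sire_term - correction
ss_ba = dam_term - sire_term
ss_e = raw - dam_term

ms_a = ss_a / (a - 1)
ms_ba = ss_ba / (a * (b - 1))
ms_e = ss_e / (a * b * (n - 1))

f_sires = ms_a / ms_ba          # fixed sires tested against dams within sires
f_dams = ms_ba / ms_e           # random dams tested against error
var_beta = (ms_ba - ms_e) / n   # ANOVA estimate of sigma_beta^2
print(round(ss_a, 4), round(ss_e, 4), round(f_sires, 3), round(f_dams, 3), round(var_beta, 3))
```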




6.15 USE OF STATISTICAL COMPUTING PACKAGES 


For balanced nested designs involving only random factors, SAS NESTED is 
the procedure of choice. Although the NESTED procedure performs the F tests 
assuming a completely random model, the computations for sums of squares 
and mean squares remain equally valid under fixed and mixed model analy- 
sis. If some of the factors are crossed or any factor is fixed, PROC ANOVA 
is more appropriate for a fixed effects factorial model with balanced struc- 
ture while GLM is more suited for a random or mixed effects model involv- 
ing balanced or unbalanced data sets. In GLM, random and mixed model 
analyses can be handled via RANDOM and TEST options. For balanced de- 
signs, analysis of variance estimates of variance components are readily ob- 
tained from the output produced by either the NESTED or GLM procedure. 
For other methods of estimation of variance components, PROC MIXED or 
VARCOMP must be used. For instructions regarding SAS commands, see 
Section 11.1. 

Among the SPSS procedures, either MANOVA or GLM could be used for nested designs involving fixed, random, or mixed effects models. In SPSS GLM, the random or mixed effects analysis of variance is performed by a RANDOM subcommand and the hypothesis testing for each effect is automatically carried out against the appropriate error term. In addition, GLM displays expected values of all the mean squares, which can be used to estimate variance components. Furthermore, SPSS Release 7.5 incorporates a new procedure, VARCOMP, especially designed to estimate variance components. For instructions regarding SPSS commands, see Section 11.2.

Among the BMDP programs, 3V or 8V can be used for nested designs. 8V is 
especially designed for balanced data sets while 3V analyzes a general mixed 
model including balanced or unbalanced designs. In 3V, the procedures for 
estimating variance components include the restricted maximum likelihood and 
the maximum likelihood estimators. If the estimates obtained via the analysis 
of variance approach are nonnegative, they agree with those obtained using the 
restricted maximum likelihood procedure. The program 2V does not directly 
give the sums of squares for nested factors. However, the cross-factor sums 
of squares could be combined to produce desired sums of squares in a nested 
design. 


6.16 WORKED EXAMPLES USING STATISTICAL PACKAGES 


In this section, we illustrate the application of statistical packages to perform 
two-way nested analysis of variance for the data sets of examples presented 
in Sections 6.11 through 6.14. Figures 6.3 through 6.6 illustrate the program 
instructions and the output results for analyzing data in Tables 6.5, 6.7, 6.9, and 
6.11 using SAS GLM/NESTED, SPSS MANOVA/GLM, and BMDP 3V/8V. 
The typical output provides the data format listed at the top, all cell means, and 




Program instructions:

    DATA CHICKENS;
    INPUT TREATMEN PEN WEIGHT;
    DATALINES;
    1 1 573
    1 1 636
    1 1 883
    . . .
    4 2 657
    ;
    PROC GLM;
    CLASSES TREATMEN PEN;
    MODEL WEIGHT=TREATMEN PEN(TREATMEN);
    RUN;

Output:

    The SAS System
    General Linear Models Procedure

    CLASS       LEVELS   VALUES
    TREATMEN    4        1 2 3 4
    PEN         2        1 2
    NUMBER OF OBS. IN DATA SET=48

    Dependent Variable: WEIGHT
                             Sum of         Mean
    Source             DF    Squares        Square      F Value   Pr > F
    Model               7    179631.58      25661.65    1.04      0.4163
    Error              40    982654.33      24566.36
    Corrected Total    47    1162285.92

    R-Square    C.V.        Root MSE    WEIGHT Mean
    0.154550    22.00455    156.74      712.29

    Source             DF    Type I SS      Mean Square   F Value
    TREATMEN            3    53943.42       17981.14      0.73
    PEN(TREATMEN)       4    125688.17      31422.04      1.28

    Source             DF    Type III SS    Mean Square   F Value
    TREATMEN            3    53943.42       17981.14      0.73
    PEN(TREATMEN)       4    125688.17      31422.04      1.28

(i) SAS application: SAS GLM instructions and output for the two-way fixed effects nested analysis of variance.


Program instructions:

    DATA LIST
    /TREATMEN 1 PEN 3 WEIGHT 5-8.
    BEGIN DATA.
    . . .
    MANOVA WEIGHT BY TREATMEN(1,4) PEN(1,2)
    /DESIGN=PEN WITHIN TREATMEN VS WITHIN
    TREATMEN VS WITHIN.

Output:

    Analysis of Variance-Design 1
    Tests of Significance for WEIGHT using UNIQUE sums of squares

    Source of Variation      SS          Sig of F
    WITHIN CELLS             982654.
    PEN WITHIN TREATMEN      125688.     .294
    TREATMEN                 53943.      .539
    (Model)                  179631.     .416
    (Total)                  1162285.

    R-Squared = .155
    Adjusted R-Squared = .007

(ii) SPSS application: SPSS MANOVA instructions and output for the two-way fixed effects nested analysis of variance.


Program instructions:

    /INPUT FILE='C:\SAHAI\TEXTO\EJE15.TXT'.
    FORMAT=FREE.
    VARIABLES=6.
    /VARIABLE NAMES=C1,C2,C3,C4,C5,C6.
    /DESIGN NAMES=T,P,C.
    LEVELS=4,2,6.
    RANDOM=C.
    FIXED=T,P.
    MODEL='T,P(T),C(P)'.

Output:

    BMDP8V - GENERAL MIXED MODEL ANALYSIS OF VARIANCE
    - EQUAL CELL SIZES Release: 7.0 (BMDP/DYNAMIC)

    ANALYSIS OF VARIANCE DESIGN
    INDEX               T    P    C
    NUMBER OF LEVELS    4    2    6
    POPULATION SIZE     4    2    INF
    MODEL               T, P(T), C(P)

    ANALYSIS OF VARIANCE FOR DEPENDENT VARIABLE 1

    SOURCE        ERROR TERM   SUM OF SQUARES   F       PROB.
    1 MEAN        C(TP)        24353252.
    2 TREATMNT    C(TP)        53943.
    3 P(T)        C(TP)        125688.          1.28    0.2943
    4 C(TP)                    982654.

    SOURCE        EXPECTED MEAN SQUARE   ESTIMATES OF VARIANCE COMPONENTS
    1 MEAN        48(1)+(4)              506847.61927
    2 TREATMNT    12(2)+(4)              -548.76829
    3 P(T)        6(3)+(4)               1142.61389
    4 C(TP)       (4)                    24566.35833

(iii) BMDP application: BMDP 8V instructions and output for the two-way fixed effects nested analysis of variance.


FIGURE 6.3. Program Instructions and Output for the Two-Way Fixed Effects 
Nested Analysis of Variance: Weight Gains Data for Example of Section 6.11 
(Table 6.5). 




Program instructions:

    DATA MOISTURE;
    INPUT BATCHES SAMPLE MOISTURE;
    DATALINES;
    1 1 40
    . . .
    ;
    PROC NESTED;
    CLASSES BATCHES SAMPLE;

Output:

    The SAS System
    Coefficients of Expected Mean Squares
    Source     BATCHES   SAMPLE   ERROR
    BATCHES    4         2        1
    SAMPLE     0         2        1
    ERROR      0         0        1

    Variance   Degrees of   Sum of
    Source     Freedom      Squares        F Value   Pr > F
    TOTAL      59           2108.183333
    BATCHES    14           1210.933333    1.492     0.2256
    SAMPLE     15           869.750000     63.255    0.0000
    ERROR      30           27.500000

    Variance                  Variance     Percent
    Source     Mean Square    Component    of Total
    TOTAL      35.731921      36.577976    100.0000
    BATCHES    86.495238      7.127976     19.4871
    SAMPLE     57.983333      28.533333    78.0069
    ERROR      0.916667       0.916667     2.5061

    Mean                      26.78333333
    Standard error of mean    1.20066119

(i) SAS application: SAS NESTED instructions and output for the two-way random effects nested analysis of variance.


Program instructions:

    DATA LIST
    /BATCHES 1-2 SAMPLE 4 MOISTURE 6-7.
    BEGIN DATA.
    . . .
    END DATA.
    GLM MOISTURE BY BATCHES SAMPLE
    /DESIGN BATCHES SAMPLE(BATCHES)
    /RANDOM BATCHES SAMPLE.

Output:

    Tests of Between-Subjects Effects    Dependent Variable: MOISTURE

    Source                          Type III SS   df   Mean Square   F        Sig.
    BATCHES           Hypothesis    1210.933      14   86.495        1.492    .226
                      Error         869.750       15   57.983(a)
    SAMPLE(BATCHES)   Hypothesis    869.750       15   57.983        63.255   .000
                      Error         27.500        30   .917(b)
    a MS(SAMPLE(BATCHES))   b MS(Error)

    Expected Mean Squares (a,b)
                       Variance Component
    Source             Var(BATCHES)   Var(SAMPLE(BATCHES))   Var(Error)
    BATCHES            4.000          2.000                  1.000
    SAMPLE(BATCHES)    .000           2.000                  1.000
    Error              .000           .000                   1.000
    a For each source, the expected mean square equals the sum of the coefficients in the cells times the variance components, plus a quadratic term involving effects in the Quadratic Term cell.
    b Expected Mean Squares are based on the Type III Sums of Squares.

(ii) SPSS application: SPSS GLM instructions and output for the two-way random effects nested analysis of variance.


Program instructions:

    /INPUT    FILE='C:\SAHAI\TEXTO\EJE16.TXT'.
              FORMAT=FREE.
              VARIABLES=2.
    /VARIABLE NAMES=A1,A2.
    /DESIGN   NAMES=B,S,A.
              LEVELS=15,2,2.
              RANDOM=B,S,A.
              MODEL='B,S(B),A(S)'.

Output:

    BMDP8V - GENERAL MIXED MODEL ANALYSIS OF VARIANCE
           - EQUAL CELL SIZES        Release: 7.0 (BMDP/DYNAMIC)

    ANALYSIS OF VARIANCE DESIGN
    INDEX               B     S     A
    NUMBER OF LEVELS   15     2     2
    POPULATION SIZE   INF   INF   INF
    MODEL   B, S(B), A(S)

    ANALYSIS OF VARIANCE FOR DEPENDENT VARIABLE 1

      SOURCE   ERROR    SUM OF         D.F.   MEAN          F        PROB.
               TERM     SQUARES               SQUARE
    1 MEAN     BATCH    43040.81667      1    43040.817    497.61    0.0000
    2 BATCH    S(B)      1210.93333     14       86.495      1.49    0.2256
    3 S(B)     A(BS)      869.75000     15       57.983     63.25    0.0000
    4 A(BS)                27.50000     30        0.917

      SOURCE   EXPECTED MEAN              ESTIMATES OF
               SQUARE                     VARIANCE COMPONENTS
    1 MEAN     60(1)+4(2)+2(3)+(4)        715.90536
    2 BATCH    4(2)+2(3)+(4)                7.12798
    3 S(B)     2(3)+(4)                    28.53333
    4 A(BS)    (4)                          0.91667


(iii) BMDP application: BMDP 8V instructions and output for the two-way random
effects nested analysis of variance.


FIGURE 6.4 Program Instructions and Output for the Two-Way Random Effects 
Nested Analysis of Variance: Moisture Content of Pigment Paste Data for Example 
of Section 6.12 (Table 6.7). 
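
The variance components and F ratios printed in Figure 6.4 can be checked by hand from the three mean squares. The following is a minimal Python sketch of that check, not part of any of the packages shown; the function name `nested_components` is hypothetical, and the mean squares and the coefficients 4 and 2 are copied from the output above.

```python
# Mean squares copied from the SAS NESTED output in Figure 6.4.
ms_batches = 86.495238   # 14 degrees of freedom
ms_sample = 57.983333    # 15 degrees of freedom
ms_error = 0.916667      # 30 degrees of freedom

samples_per_batch = 2    # b: samples nested within each batch
tests_per_sample = 2     # n: determinations made on each sample


def nested_components(ms_a, ms_b, ms_e, b, n):
    """ANOVA estimates for the balanced two-way random nested model.

    Returns (var_a, var_b, var_e); estimates may be negative.
    """
    var_e = ms_e
    var_b = (ms_b - ms_e) / n
    var_a = (ms_a - ms_b) / (b * n)
    return var_a, var_b, var_e


var_batches, var_sample, var_error = nested_components(
    ms_batches, ms_sample, ms_error, samples_per_batch, tests_per_sample
)

# F ratios: batches are tested against samples, samples against error.
f_batches = ms_batches / ms_sample
f_sample = ms_sample / ms_error

print(f"Var(BATCHES) = {var_batches:.6f}")
print(f"Var(SAMPLE)  = {var_sample:.6f}")
print(f"Var(ERROR)   = {var_error:.6f}")
print(f"F(BATCHES)   = {f_batches:.3f}")
print(f"F(SAMPLE)    = {f_sample:.3f}")
```

The printed values should agree, up to rounding, with the Variance Component and F Value columns of the three outputs in Figure 6.4.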




Program instructions:

    DATA BREEDING;
    INPUT SIRE DAM WEIGHT;
    DATALINES;
    . . .
    PROC GLM;
    CLASSES SIRE DAM;
    MODEL WEIGHT = SIRE
          DAM(SIRE);
    RANDOM SIRE DAM(SIRE)/TEST;

    CLASS   LEVELS   VALUES
    SIRE      4      1 2 3 4
    DAM       4      1 2 3 4
    NUMBER OF OBS. IN DATA SET=52

Output:

    The SAS System
    General Linear Models Procedure
    Dependent Variable: WEIGHT

                             Sum of         Mean
    Source             DF    Squares        Square       F Value   Pr > F
    Model              11    1920.26007     174.56910     7.35     0.0001
    Error              40     949.79762      23.74494
    Corrected Total    51    2870.05769

    R-Square     C.V.        Root MSE    WEIGHT Mean
    0.669067     17.80672    4.87288     27.3654

    Source        DF    Type I SS      Mean Square    F Value   Pr > F
    SIRE           3    1669.94341     556.64780      23.44     0.0001
    DAM(SIRE)      8     250.31667      31.28958       1.32     0.2628

    Source        DF    Type III SS    Mean Square    F Value   Pr > F
    SIRE           3    1594.12974     531.37658      22.38     0.0001
    DAM(SIRE)      8     250.31667      31.28958       1.32     0.2628

    Source        Type III Expected Mean Square
    SIRE          Var(Error) + 4.1311 Var(DAM(SIRE)) + 12.26 Var(SIRE)
    DAM(SIRE)     Var(Error) + 4.2695 Var(DAM(SIRE))

    Tests of Hypotheses for Random Model Analysis of Variance

    Source: SIRE    Error: 0.9676*MS(DAM(SIRE)) + 0.0324*MS(Error)
                            Denominator   Denominator
    DF    Type III MS       DF            MS              F Value   Pr > F
     3    531.37658135      8.41          31.045049505    17.1163   0.0006

    Source: DAM(SIRE)    Error: MS(Error)
                            Denominator   Denominator
    DF    Type III MS       DF            MS              F Value   Pr > F
     8    31.289583333      40            23.744940476     1.3177   0.2628


(i) SAS application: SAS GLM instructions and output for the two-way random effects 
nested analysis of variance with unequal numbers in the subclasses. 


Program instructions:

    DATA LIST
      /SIRE 1 DAM 3
       WEIGHT 5-6.
    BEGIN DATA.
    . . .
    END DATA.
    GLM WEIGHT BY
       SIRE DAM
      /DESIGN SIRE
       DAM(SIRE)
      /RANDOM SIRE DAM.

Output:

    Tests of Between-Subjects Effects      Dependent Variable: WEIGHT

    Source                   Type III SS    df       Mean Square     F        Sig.
    SIRE        Hypothesis    1594.130       3       531.377        17.116    .001
                Error          261.114       8.411    31.045(a)
    DAM(SIRE)   Hypothesis     250.317       8        31.290         1.318
                Error          949.798      40        23.745(b)
    a .968 MS(D(S)) + 3.241E-02 MS(E)   b MS(Error)

    Expected Mean Squares(a,b)
                               Variance Component
    Source       Var(SIRE)   Var(DAM(SIRE))   Var(Error)
    SIRE          12.260        4.131           1.000
    DAM(SIRE)       .000        4.269           1.000
    Error           .000         .000           1.000
    a For each source, the expected mean square equals the sum of the
    coefficients in the cells times the variance components, plus


(ii) SPSS application: SPSS GLM instructions and output for the two-way random
effects nested analysis of variance with unequal numbers in the subclasses.


FIGURE 6.5 Program Instructions and Output for the Two-Way Random Effects 
Nested Analysis of Variance with Unequal Numbers in the Subclasses: Breeding 
Data for Example of Section 6.13 (Table 6.9). 




Program instructions:

    /INPUT    FILE='C:\SAHAI\TEXTO\EJE17.TXT'.
              FORMAT=FREE.
              VARIABLES=3.
    /VARIABLE NAMES=SIRE,DAM,WEIGHT.
    /GROUP    CODES(SIRE)=1,2,3,4.
              NAMES(SIRE)=S1,S2,S3,S4.
              CODES(DAM)=1,2,3,4.
              NAMES(DAM)=D1,D2,D3,D4.
    /DESIGN   DEPENDENT=WEIGHT.
              RANDOM=SIRE.
              RANDOM=DAM,SIRE.
              RNAMES=S,'D(S)'.
              METHOD=REML.

Output:

    BMDP3V - GENERAL MIXED MODEL ANALYSIS OF VARIANCE
                                     Release: 7.0 (BMDP/DYNAMIC)
    DEPENDENT VARIABLE WEIGHT

    PARAMETER    ESTIMATE    STANDARD    EST/       TWO-TAIL PROB.
                             ERROR       ST.DEV     (ASYM. THEORY)
    ERR.VAR.      23.480      5.184
    CONSTANT      26.441      3.477      7.603      0.000
    SIRE          45.601     39.793
    DAM(SIRE)      2.135      3.779

    TESTS OF FIXED EFFECTS BASED ON ASYMPTOTIC VARIANCE-COVARIANCE MATRIX

    SOURCE      F-STATISTIC    DEGREES OF    PROBABILITY
                               FREEDOM
    CONSTANT      57.81        1    51       0.00000


(iii) BMDP application: BMDP 3V instructions and output for the two-way random 
effects nested analysis of variance with unequal numbers in the subclasses. 


FIGURE 6.5 (continued) 


——— 


Program instructions:

    DATA LITTER;
    INPUT SIRE DAM WEIGHT;
    DATALINES;
    1 1 2.77
    1 1 2.38
    1 2 2.58
    1 2 2.94
    . . .
    5 2 2.48
    ;
    PROC GLM;
    CLASSES SIRE DAM;
    MODEL WEIGHT=SIRE DAM(SIRE);
    RANDOM DAM(SIRE);
    TEST H=SIRE E=DAM(SIRE);
    RUN;

    CLASS   LEVELS   VALUES
    SIRE      5      1 2 3 4 5
    DAM       2      1 2
    NUMBER OF OBS. IN DATA SET=20

Output:

    The SAS System
    General Linear Models Procedure
    Dependent Variable: WEIGHT

                             Sum of        Mean
    Source             DF    Squares       Square        F Value   Pr > F
    Model               9    0.66328000    0.07369778     1.90     0.1649
    Error              10    0.38700000    0.03870000
    Corrected Total    19    1.05028000

    R-Square     C.V.         Root MSE      WEIGHT Mean
    0.631527     7.6427022    0.19672316    2.574000

    Source        DF    Type III SS    Mean Square    F Value   Pr > F
    SIRE           4    0.09973000     0.02493250      0.64     0.6433
    DAM(SIRE)      5    0.56355000     0.11271000      2.91     0.0707

    Source        Type III Expected Mean Square
    SIRE          Var(Error) + 2 Var(DAM(SIRE)) + Q(SIRE)
    DAM(SIRE)     Var(Error) + 2 Var(DAM(SIRE))

    Tests of Hypotheses using the Type III MS for DAM(SIRE) as an error term
    Source        DF    Type III SS    Mean Square    F Value   Pr > F
    SIRE           4    0.09973000     0.02493250      0.22     0.9155


(i) SAS application: SAS GLM instructions and output for the two-way mixed effects 
nested analysis of variance. 


FIGURE 6.6 Program Instructions and Output for the Two-Way Mixed Effects 
Nested Analysis of Variance: Average Daily Weight Gains Data for Example of 
Section 6.14 (Table 6.11). 




Tests of Between-Subjects Effects Dependent Variable: WEIGHT 


WEIGHT*5-8(2) | Source Type III SS df Mean Square F Sig. 
BEGIN DATA. SIRE Hypothesis 9.973E-02 -221 .916 
77 Error -564 
.38 DAM (SIRE) Hypothesis -564 2.912 .071 
.58 Error 387 10 
"94 a MS(DAM(SIRE)) b MS(Error) 
.28 
.22 Expected Mean Squares (a,b) 
.01 Variance Component 
; Source Var (DAM(SIRE) ) Var (Error) Quadratic Term 
-48 SIRE 2.000 1.000 Sire 
DATA. DAM (SIRE) 2.000 1.000 
WEIGHT BY Error .-000 1.000 
a For each source, the expected mean square equals the sum of the 
/DESIGN SIRE coefficients in the cells times the variance components, plus a quadraticf 
DAM (SIRE) term involving effects in the Quadratic Term cell. b Expected Mean Squares | 
/RANDOM SIRE. are based on the Type III Sums of Squares. 


| 


(ii) SPSS application: SPSS GLM instructions and output for the two-way mixed effects 
nested analysis of variance. 


Program instructions:

    /INPUT    FILE='C:\SAHAI\TEXTO\EJE18.TXT'.
              FORMAT=FREE.
              VARIABLES=2.
    /VARIABLE NAMES=PIG1,PIG2.
    /DESIGN   NAMES=SIRE,DAM,PIG.
              LEVELS=5,2,2.
              RANDOM=DAM,PIG.
              FIXED=SIRE.
              MODEL='S,D(S),P(D)'.

Output:

    BMDP8V - GENERAL MIXED MODEL ANALYSIS OF VARIANCE
           - EQUAL CELL SIZES        Release: 7.0 (BMDP/DYNAMIC)

    ANALYSIS OF VARIANCE DESIGN
    INDEX               S     D     P
    NUMBER OF LEVELS    5     2     2
    POPULATION SIZE     5   INF   INF
    MODEL   S, D(S), P(D)

    ANALYSIS OF VARIANCE FOR DEPENDENT VARIABLE 1

      SOURCE   ERROR    SUM OF        D.F.   MEAN           F         PROB.
               TERM     SQUARES              SQUARE
    1 MEAN     D(S)     132.509519      1    132.509519    1175.67    0.0000
    2 SIRE     D(S)       0.099730      4      0.024933       0.22    0.9155
    3 D(S)     P(SD)      0.563550      5      0.112710       2.91    0.0707
    4 P(SD)               0.387000     10      0.038700

      SOURCE   EXPECTED MEAN         ESTIMATES OF VARIANCE
               SQUARE                COMPONENTS
    1 MEAN     20(1)+2(3)+(4)         6.61984
    2 SIRE     4(2)+2(3)+(4)         -0.02194
    3 D(S)     2(3)+(4)               0.03700
    4 P(SD)    (4)                    0.03870


(iii) BMDP application: BMDP 8V instructions and output for the two-way mixed 
effects nested analysis of variance. 


FIGURE 6.6 (continued) 


the entries of the analysis of variance table. It should be noticed that in each 
case the results are the same as those provided using manual computations in 
Sections 6.11 through 6.14. However, note that in an unbalanced design, certain 
tests of significance may differ from one program to the other since they use 
different types of sums of squares. 
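
As an illustration of how the unbalanced test for SIRE in Figure 6.5 is assembled, the sketch below rebuilds the synthesized error mean square and its approximate (Satterthwaite) degrees of freedom from the printed Type III expected mean squares. It is a hand check written in Python, not a reproduction of the SAS internals; all numerical inputs are copied from the SAS GLM output in Figure 6.5, and the variable names are hypothetical.

```python
# Values copied from the SAS GLM output in Figure 6.5.
coef_dam_in_sire = 4.1311   # coefficient of Var(DAM(SIRE)) in E(MS_SIRE)
coef_dam_in_dam = 4.2695    # coefficient of Var(DAM(SIRE)) in E(MS_DAM(SIRE))
ms_sire = 531.37658135
ms_dam = 31.289583333       # 8 degrees of freedom
ms_error = 23.744940476     # 40 degrees of freedom
df_dam, df_error = 8, 40

# Choose weights so that the synthesized denominator has the same expectation
# as MS_SIRE when Var(SIRE) = 0.
w_dam = coef_dam_in_sire / coef_dam_in_dam
w_error = 1.0 - w_dam
denom_ms = w_dam * ms_dam + w_error * ms_error

# Satterthwaite approximation to the degrees of freedom of the linear
# combination of mean squares.
denom_df = denom_ms ** 2 / (
    (w_dam * ms_dam) ** 2 / df_dam + (w_error * ms_error) ** 2 / df_error
)

f_sire = ms_sire / denom_ms

print(f"weights: {w_dam:.4f} MS(DAM(SIRE)) + {w_error:.4f} MS(Error)")
print(f"denominator MS = {denom_ms:.4f}, denominator DF = {denom_df:.2f}")
print(f"F for SIRE = {f_sire:.4f}")
```

The weights, denominator mean square, denominator degrees of freedom, and F value should match the figure up to the rounding of the printed coefficients.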


EXERCISES 


1. An experiment was designed to study the ignition rate of dynamite
from three different explosive-forming processes. Four types of
dynamite were randomly selected from each explosive-forming process
and three measurements of ignition rate were made on each type. The
data in certain standard units are given as follows.


Explosive
Process               1                        2                        3
Dynamite
Type        1     2     3     4      1     2     3     4      1     2     3     4
          28.1  23.0  18.3  18.1   23.1  26.3  21.2  38.0   17.1  38.1  41.1  28.0
          32.3  31.1  20.6  19.0   20.2  27.7  24.6  30.0   18.0  24.5  37.5  32.3
          29.5  23.5  17.5  16.6   17.6  24.8  20.3  28.0   23.7  27.3  53.6  36.5


(a) Describe the model and the assumptions for the experiment.

(b) Analyze the data and report the analysis of variance table.

(c) Test whether there are differences in ignition rates among the
three explosive-forming processes. Use α = 0.05.

(d) Test whether there are differences in ignition rates among
dynamite types within the explosive processes. Use α = 0.05.

(e) Estimate the variance components of the model and determine
95 percent confidence intervals for them.

2. A manufacturing company wishes to study the tensile strength of 
yarns produced on four different looms. An experiment was designed 
wherein 12 machinists were selected at random and each loom was 
run by three different machinists and two specimens from each ma- 
chinist were obtained and tested. The data in certain standard units 
are given as follows. 


Loom               1                  2                  3                  4
Machinist     1     2     3      1     2     3      1     2     3      1     2     3
            38.2  53.5  15.3   61.3  41.5  35.3   47.1  22.5  14.7   15.5  19.3  21.6
            21.6  51.5  26.7   58.3  38.5  27.3   34.3  25.7  26.3   32.3  35.7  26.5


(a) Describe the model and the assumptions for the experiment.

(b) Analyze the data and report the analysis of variance table.

(c) Test whether there are differences in the tensile strength among
the four looms. Use α = 0.05.

(d) Test whether there are differences in the tensile strength among
machinists within looms. Use α = 0.05.

(e) Estimate the variance components of the model.

3. A manufacturing company wishes to study the material variability of
a particular product being manufactured on three different machines.
Each machine operates in two shifts and four samples are randomly
chosen from each shift. The data in certain standard units are given
as follows.


Machine         1              2              3
Shift        1     2        1     2        1     2
           23.5  19.3     25.1  23.5     25.0  27.3
           20.7  20.5     26.5  21.3     19.5  26.5
           22.9  21.3     24.7  22.6     23.4  26.3
           23.3  19.7     25.3  24.7     22.3  25.8


(a) Describe the model and the assumptions for the experiment.

(b) Analyze the data and report the analysis of variance table.

(c) Test whether there are differences in the material variability
among the machines. Use α = 0.05.

(d) Test whether there are differences in the material variability
between the shifts within the machines. Use α = 0.05.


4. An industrial firm wishes to streamline production scheduling by
assigning one time standard to a particular class of machines. An
experiment was designed wherein three machines are randomly
selected and each machine is assigned to a different group of three
operators selected at random. Each operator uses the machine three
times at different periods during a given week. The data in certain
standard units are given as follows.


Machine              1                      2                      3
Operator       1      2      3        1      2      3        1      2      3
            103.2  104.1  103.8     99.5  102.6   99.7    107.4  106.0  105.4
            104.3  104.6  102.7     99.8  101.7  101.2    107.6  103.0  104.4
            105.1  103.7  101.5     98.7  103.5  101.7    108.1  104.2  103.7


(a) Describe the model and the assumptions for the experiment.

(b) Analyze the data and report the analysis of variance table.

(c) Test whether there are differences in completion time among
machines. Use α = 0.05.

(d) Test whether there are differences in completion time between
operators within machines. Use α = 0.05.

(e) Estimate the variance components of the model and determine
95 percent confidence intervals for them.


5. A public health official wishes to test the difference in mean fluoride
concentration of water in a community. An experiment was designed
wherein three water samples were taken from each of three sources of
water supply and three determinations of fluoride content were
performed on each of the nine samples. The data in milligrams fluoride
per liter of water are given as follows.


Supply 1 2 3 
Sample 1 2 3 1 2 3 1 2 3 


17 619 18 19 #19 20 24 27 28 
18 17 #16 20 2.1 18 26 26 25 
19 618 0634.7 2.2 22 19 25 #28 £2.46 


(a) Describe the model and the assumptions for the experiment.

(b) Analyze the data and report the analysis of variance table.

(c) Test the hypothesis that there is no difference in mean fluoride
content between samples at a source of water supply. Use α = 0.05.

(d) Test the hypothesis that there is no difference in mean fluoride
concentration between the sources of water supply. Use α = 0.05.

6. A nutritional scientist wishes to test the difference in protein levels in 
animals fed on different dietary regimens. An experiment is designed 
wherein four animals of a certain species are subjected to each of two 
dietary regimens and three samples of blood are drawn from each 
animal to determine the protein content (in mg/100 ml blood). The 
data are given as follows. 


Dietary
Regimen              1                          2
Animal       1     2     3     4        1     2     3     4
           3.44  3.51  3.54  3.52     4.88  4.91  4.79  4.95
           3.46  3.50  3.52  3.53     4.84  4.89  4.81  4.93
           3.47  3.47  3.59  3.57     4.85  4.87  4.82  4.92


(a) Describe the model and the assumptions for the experiment.

(b) Analyze the data and report the analysis of variance table.

(c) Test the hypothesis that there are no differences in mean protein
levels between animals fed on a given dietary regimen. Use α = 0.05.

(d) Test the hypothesis that there are no differences in mean protein
levels between the two dietary regimens. Use α = 0.05.

7. An experiment is designed involving a two-stage nested design with 
the levels of factor B nested within the levels of factor A. The relevant 
data are given as follows. 




44.3 41.0 44.3 41.0 38.7 42.1 
44.3 43.2 44.3 42.1 38.7 42.1 




Factor A 1 2 3 
Factor B 1 2 1 2 3 1 2 


12,2- 3.3- 2 8.3 7.0 13.2 5.7 
10.1 73 133 102 63 93 6.2 
14.2 15.1 9.1 3.5 10.1 

12.5 


(a) Describe the model and the assumptions for the experiment. It
is assumed that both factors A and B are fixed.

(b) Analyze the data and report the analysis of variance table.

(c) Test whether there are differences in the levels of factor A. Use
α = 0.05.

(d) Test whether there are differences in the levels of factor B within
A. Use α = 0.05.


8. A study is conducted to investigate whether a batch of material is
homogeneous by randomly sampling the material in five different
vats from the large number of vats produced. Three bags are chosen
at random from each vat. Finally, two independent analyses are made
on each sample to determine the percentage of a particular substance.
The data in certain standard units are given as follows.


39.9 45.4 46.5 38.7 38.7 46.5 36.5 36.5 38.9 
38.7 44.3 45.4 37.6 39.9 46.5 36.5 35.4 39.8 


(a) Describe the model and the assumptions for the experiment.

(b) Analyze the data and report the analysis of variance table.

(c) Test the hypothesis that there is no difference in mean percentage
values between bags within a vat. Use α = 0.05.

(d) Test the hypothesis that there is no difference in mean percentage
values between vats. Use α = 0.05.

(e) Estimate the variance components of the model and determine
95 percent confidence intervals on them.


9. Hicks (1956) reported the results of an experiment involving strain
measurements on each of four seals made on each of the four heads
of each of the five sealing machines. The coded raw data are given
as follows.

Machine       1            2            3            4            5
Head       1  2  3  4   1  2  3  4   1  2  3  4   1  2  3  4   1  2  3  4


6 13 1 7 10 2 4 0 0 10 8 7 11 5 10 16 3 3 
2 3 100 4 9 1 1 3 0 11 5 2 0 10 8 8 470 +7 
0 9 07 7 174 5 60 5 6 9 670 2 4 
8 8 6 9 12 10 9 15 77 4 #4 459 3 2 0 


Source: Hicks (1956, p. 14). Used with permission. 


(a) Describe the model and the assumptions for the experiment.

(b) Analyze the data and report the analysis of variance table.

(c) Test the hypothesis that there is no difference in mean strain
values between the heads within a machine. Use α = 0.05.

(d) Test the hypothesis that there is no difference in mean strain
values between the machines. Use α = 0.05.

10. Sokal and Rohlf (1995, p. 294) reported data from an experiment 
designed to investigate variation in the blood pH of female mice. 
The experiment was carried out on 15 dams that were mated over a 
period of time with either two or three sires. Each sire was mated 
to different dams and measurements were made on the blood pH 
reading of a female offspring. The following data refer to a subset 
of 5 dams which have been randomly selected from 15 dams in the
experiment.

Dam            1           2              3              4              5
Sire         1    2      1    2      1    2    3      1    2      1    2    3
pH         7.48 7.48   7.38 7.37   7.41 7.47 7.53   7.39 7.50   7.39 7.43 7.46
Reading    7.48 7.53   7.48 7.31   7.42 7.36 7.40   7.31 7.44   7.37 7.38 7.44
           7.52 7.43   7.46 7.45   7.36 7.43 7.44   7.30 7.40   7.33 7.44 7.37
           7.54 7.39 7.41 7.47 7.38 7.40 7.41 7.45 7.43 7.54


Source: Sokal and Rohlf (1995, p. 294). Used with permission. 


(a) Describe the model and the assumptions for the experiment.

(b) Analyze the data and report the analysis of variance table.

(c) Test whether there are differences in mean pH readings between
dams. Use α = 0.05.

(d) Test whether there are differences in mean pH readings between
sires within dams. Use α = 0.05.

(e) Estimate the variance components of the model.

11. Marcuse (1949) reported results from an experiment designed to
investigate the moisture content of cheese. The experiment was
conducted by sampling three different lots from the large number
of lots produced. Two samples of cheese were selected at random
from each lot. Finally, two subsamples per sample were chosen and
independent analyses made on each subsample to determine the
percentage of moisture content. The data are given as follows.


Lot 1 2 3 
Sample 1 2 1 2 1 2 


39.02 38.96 35.74 35.58 37.02 35.70 
38.79 39.01 35.41 35.52 36.00 36.04 


Source: Marcuse (1949). Used with permission. 


(a) Describe the model and the assumptions for the experiment.

(b) Analyze the data and report the analysis of variance table.

(c) Test the hypothesis that there are no differences in mean percentage
values of the moisture content between samples within lots.
Use α = 0.05.

(d) Test the hypothesis that there are no differences in mean percentage
values of the moisture content between lots. Use α = 0.05.

(e) Estimate the variance components of the model and determine
95 percent confidence intervals on them.

12. Sokal and Rohlf (1995, p. 276) reported data from a biological ex- 
periment involving 12 female mosquito pupae. The mosquitos were 
randomly assigned into three rearing cages with each cage receiving 
4 pupae. The reported responses are independent measurements of 
left wings of the mosquito and the data are given as follows. 


Cage                 1                        2                        3
Mosquito     1     2     3     4      1     2     3     4      1     2     3     4
           58.5  77.8  84.0  70.1   69.8  56.0  50.7  63.8   56.6  77.8  69.9  62.1
           59.5  80.9  83.6  68.3   69.8  54.5  49.3  65.8   57.5  79.2  69.2  64.5


Source: Sokal and Rohlf (1995, p. 276). Used with permission. 


(a) Describe the model and assumptions for the experiment.

(b) Analyze the data and report the analysis of variance table.

(c) Test the hypothesis that there are no differences in mean measurement
values between mosquitos within a cage. Use α = 0.05.

(d) Test the hypothesis that there are no differences in mean measurement
values between cages. Use α = 0.05.

(e) Estimate the variance components of the model and determine
95 percent confidence intervals on them.




13. Steel and Torrie (1980, p. 154) reported data from a greenhouse ex- 
periment that examined the growth of mint plants. A large group of 
plants were assigned at random to pots with each pot receiving four 
plants. Treatments were randomly assigned to pots with each treat- 
ment receiving three pots. There were 6 fixed treatments representing 
combinations of cross factors with 3 levels of hours of daylight and 2 
levels of temperatures. Observations were made on individual plants 
where the response variable was the one week stem growth of the 
mint plant. The data are given as follows. 


Treatment* 


Pot 1 


Treatment* 


Pot 1 


8.5 
6.0 
9.0 
8.5 


Source: Steel and Torrie (1990, p. 


6.5 
7.0 
8.0 
6.5 


7.0 
7.0 
7.0 
7.0 


6.0 
5.5 
3.5 
7.0 


6.0 
8.5 
4.5 
7.5 


154). Used with permission. 


6.5 
6.5 
8.5 
a3 


7.0 
9.0 
8.5 
8.5 


6.0 
7.0 
7.0 
7.0 


11.0 
7.0 
9.0 
8.0 


* Treatments representing combinations of hours of daylight
and temperatures are defined as follows:

                     Hours of Daylight
Temperature        8       12       16
Low                1        2        3
High               4        5        6

(a) Describe the model and assumptions for the experiment.

(b) Analyze the data and report the analysis of variance table.

(c) Test the hypothesis that there are no differences in mean stem
growths between pots within a treatment. Use α = 0.05.

(d) Test the hypothesis that there are no differences in mean stem
growths between the treatments. Use α = 0.05.


(e) Estimate the variance components of the model and determine 
95 percent confidence intervals on them. 




(f) Estimate the contrast representing the difference between the two
treatments defined as low temperature with 8 hours and high
temperature with 8 hours, and set a 95 percent confidence interval.

(g) Estimate the contrast defined as: Low-8 + Low-12 + Low-16
− High-8 − High-12 − High-16, and set a 95 percent confidence
interval.

14. Sokal and Rohlf (1994, p. 364) reported data from an experiment 
designed to investigate the effects of breed and maturity of pure-bred 
cows on butterfat content. Five breeds of pure-bred dairy cattle were 
taken from Canadian records and random samples of 10 mature (>5 
years old) and 10 two-year-old cows were selected from each of five 
breeds. The following data give average butterfat percentages for 
each cow. 


Breed      Ayrshire       Canadian       Guernsey     Holstein-Friesian     Jersey
Cow      Mature  2-yr   Mature  2-yr   Mature  2-yr    Mature  2-yr      Mature  2-yr
          3.74   4.44    3.92   4.29    4.54   5.30     3.40   3.79       4.80   5.75
          4.01   4.37    4.95   5.24    5.18   4.50     3.55   3.66       6.45   5.14
          3.77   4.25    4.47   4.43    5.75   4.59     3.83   3.58       5.18   5.25
          3.78   3.71    4.28   4.00    5.04   5.04     3.95   3.38       4.49   4.76
          4.10   4.08    4.07   4.62    4.64   4.83     4.43   3.71       5.24   5.18
          4.06   3.90    4.10   4.29    4.79   4.55     3.70   3.94       5.70   4.22
          4.27   4.41    4.38   4.85    4.72   4.97     3.30   3.59       5.41   5.98
          3.94   4.11    3.98   4.66    3.88   5.38     3.93   3.55       4.77   4.85
          4.11   4.37    4.46   4.40    5.28   5.39     3.58   3.55       5.18   6.55
          4.25   3.53    5.05   4.33    4.66   5.97     3.54   3.43       5.23   5.72


Source: Sokal and Rohlf (1994, p. 364). Used with permission. 


(a) Describe the model and the assumptions for the experiment.
Would you use Model I, Model II, or Model III? Explain.

(b) Analyze the data and report the analysis of variance table.

(c) Test the hypothesis that there are no differences in average
percentage values of the butterfat content between mature and
two-year-old cows. Use α = 0.05.

(d) Test the hypothesis that there are no differences in average
percentage values of the butterfat content between the breeds. Use
α = 0.05.

(e) If you assumed any of the factors to be random, estimate the
variance components of the model and construct 95 percent
confidence intervals on them.


7 Three-Way
and Higher-Order
Nested Classifications


7.0 PREVIEW 


The results of the preceding chapter can be readily extended to the case of 
three-way and the general g-way nested or hierarchical classifications. As an 
example of a three-way nested classification, suppose a chemical company 
wishes to examine the strength of a certain liquid chemical. The chemical is 
made in large vats and then is barreled. To study the strength of the chemical, 
an analyst randomly selects three different vats of the product. Three barrels are 
selected at random from each vat and then three samples are taken from each 
barrel. Finally, two independent measurements are made on each sample. The 
physical layout can be depicted schematically as shown in Figure 7.1. In this 
experiment, barrels are nested within the levels of the factor vats and samples 
are nested within the levels of the factor barrels. This is the so-called three-way
nested classification having two replicates or measurements. In this chapter, we 
consider the three-way nested classification and indicate its generalization to 
higher-order nested classifications. 


7.1 MATHEMATICAL MODEL 


Consider three factors A, B, and C having a, b, and c levels respectively. The 
b levels of factor B are nested under each level of A and c levels of factor C are 
nested under each level of factor B (within A), and there are n replicates within 
the combination of levels of A, B, and C. The analysis of variance model for 
this type of experimental layout is taken as 


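\[
y_{ijk\ell} = \mu + \alpha_i + \beta_{j(i)} + \gamma_{k(ij)} + e_{\ell(ijk)}
\qquad
\begin{cases}
i = 1, 2, \ldots, a \\
j = 1, 2, \ldots, b \\
k = 1, 2, \ldots, c \\
\ell = 1, 2, \ldots, n,
\end{cases}
\qquad (7.1.1)
\]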


where $\mu$ is the general mean, $\alpha_i$ is the effect due to the i-th level of factor A,
$\beta_{j(i)}$ is the effect due to the j-th level of factor B within the i-th level of factor
A, $\gamma_{k(ij)}$ is the effect due to the k-th level of factor C within the j-th level of
factor B and the i-th level of factor A, and $e_{\ell(ijk)}$ is the error term that represents
the variation within each cell.

[Figure: tree diagram with rows labeled Vats (I, II, III), Barrels, Samples, and Measurements]

FIGURE 7.1 A Layout for the Three-Way Nested Design Where Barrels Are Nested
within Vats and Samples Are Nested within Barrels.

When all the factors have systematic effects, Model I is applicable to the data
in (7.1.1). When all the factors are random, Model II is appropriate; and when
some factors are fixed and others are random, a mixed model or Model III is the
appropriate one. The assumptions under Models I, II, and III are exact parallels
to those of the model (6.1.1). For example, if we assume that A is fixed and B
and C are random, then the $\alpha_i$'s are unknown fixed constants with the restriction
that $\sum_{i=1}^{a} \alpha_i = 0$, and the $\beta_{j(i)}$'s, $\gamma_{k(ij)}$'s, and $e_{\ell(ijk)}$'s are mutually and completely
uncorrelated random variables with zero means and variances $\sigma_\beta^2$, $\sigma_\gamma^2$, and $\sigma_e^2$,
respectively.
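
To make the structure of model (7.1.1) concrete, the following Python sketch generates one balanced data set under Model II for a layout like the vats, barrels, and samples example of Section 7.0. It is an illustration only: the function name `simulate_nested`, the numerical values of the mean and standard deviations, and the use of normal distributions are assumptions made here (the model itself requires only zero means, the stated variances, and uncorrelatedness).

```python
import random


def simulate_nested(a, b, c, n, mu, sd_alpha, sd_beta, sd_gamma, sd_e, seed=0):
    """Return y[i][j][k][l] generated from model (7.1.1) with all effects random.

    A fresh B effect is drawn inside every level of A, and a fresh C effect
    inside every level of B, which is what makes the factors nested.
    """
    rng = random.Random(seed)
    y = []
    for _ in range(a):
        alpha = rng.gauss(0.0, sd_alpha)
        a_level = []
        for _ in range(b):
            beta = rng.gauss(0.0, sd_beta)
            b_level = []
            for _ in range(c):
                gamma = rng.gauss(0.0, sd_gamma)
                cell = [mu + alpha + beta + gamma + rng.gauss(0.0, sd_e)
                        for _ in range(n)]
                b_level.append(cell)
            a_level.append(b_level)
        y.append(a_level)
    return y


# Hypothetical parameter values: 3 vats, 3 barrels per vat, 3 samples per
# barrel, 2 measurements per sample.
y = simulate_nested(3, 3, 3, 2, mu=50.0,
                    sd_alpha=2.0, sd_beta=1.5, sd_gamma=1.0, sd_e=0.5)
print(len(y), len(y[0]), len(y[0][0]), len(y[0][0][0]))
```

For a mixed model with A fixed, the draw of `alpha` would be replaced by fixed constants summing to zero.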


7.2 ANALYSIS OF VARIANCE 


The calculations of the sums of squares and the analysis of variance for the 
three-way nested design are similar to the analysis for the two-way nested 
design presented in Chapter 6. The formulae for the sums of squares together 
with their computational forms are simple extensions of the formulae for the 
two-way nested design given in Section 6.7. Thus, starting with the identity 


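\[
y_{ijk\ell} - \bar{y}_{....} = (\bar{y}_{i...} - \bar{y}_{....}) + (\bar{y}_{ij..} - \bar{y}_{i...}) + (\bar{y}_{ijk.} - \bar{y}_{ij..}) + (y_{ijk\ell} - \bar{y}_{ijk.}),
\]

the total sum of squares is partitioned as

\[
SS_T = SS_A + SS_{B(A)} + SS_{C(B)} + SS_E,
\]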




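where

\[
SS_T = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\sum_{\ell=1}^{n} (y_{ijk\ell} - \bar{y}_{....})^2
     = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\sum_{\ell=1}^{n} y_{ijk\ell}^2 - \frac{y_{....}^2}{abcn},
\]

\[
SS_A = bcn \sum_{i=1}^{a} (\bar{y}_{i...} - \bar{y}_{....})^2
     = \frac{1}{bcn}\sum_{i=1}^{a} y_{i...}^2 - \frac{y_{....}^2}{abcn},
\]

\[
SS_{B(A)} = cn \sum_{i=1}^{a}\sum_{j=1}^{b} (\bar{y}_{ij..} - \bar{y}_{i...})^2
          = \frac{1}{cn}\sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij..}^2 - \frac{1}{bcn}\sum_{i=1}^{a} y_{i...}^2,
\]

\[
SS_{C(B)} = n \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c} (\bar{y}_{ijk.} - \bar{y}_{ij..})^2
          = \frac{1}{n}\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c} y_{ijk.}^2 - \frac{1}{cn}\sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij..}^2,
\]

and

\[
SS_E = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\sum_{\ell=1}^{n} (y_{ijk\ell} - \bar{y}_{ijk.})^2
     = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\sum_{\ell=1}^{n} y_{ijk\ell}^2 - \frac{1}{n}\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c} y_{ijk.}^2,
\]

with

\[
y_{ijk.} = \sum_{\ell=1}^{n} y_{ijk\ell}, \qquad \bar{y}_{ijk.} = y_{ijk.}/n,
\]

\[
y_{ij..} = \sum_{k=1}^{c}\sum_{\ell=1}^{n} y_{ijk\ell}, \qquad \bar{y}_{ij..} = y_{ij..}/cn,
\]

\[
y_{i...} = \sum_{j=1}^{b}\sum_{k=1}^{c}\sum_{\ell=1}^{n} y_{ijk\ell}, \qquad \bar{y}_{i...} = y_{i...}/bcn,
\]

\[
y_{....} = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\sum_{\ell=1}^{n} y_{ijk\ell}, \qquad \text{and} \qquad \bar{y}_{....} = y_{....}/abcn.
\]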


It should be noticed that the nested sums of squares are related to the sums 
of squares for main effects and interactions, considering all the factors being 
crossed, as follows: 


\[
SS_{B(A)} = SS_B + SS_{AB}, \qquad SS_{C(B)} = SS_C + SS_{AC} + SS_{BC} + SS_{ABC}.
\]
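
The computational forms of the sums of squares for the balanced three-way nested design can be sketched in a few lines of Python. This is an illustrative sketch using only the standard library; the function name `nested_sums_of_squares` and the small data set are made up for the example and do not come from the text.

```python
def nested_sums_of_squares(y):
    """Sums of squares for a balanced three-way nested layout.

    y[i][j][k][l] is observation l in level k of C, within level j of B,
    within level i of A. Uses the totals-based computational forms.
    """
    a, b, c, n = len(y), len(y[0]), len(y[0][0]), len(y[0][0][0])

    # Totals y_ijk., y_ij.., y_i... and y_....
    cell_tot = [[[sum(cell) for cell in b_level] for b_level in a_level]
                for a_level in y]
    b_tot = [[sum(cells) for cells in a_level] for a_level in cell_tot]
    a_tot = [sum(row) for row in b_tot]
    grand = sum(a_tot)

    raw = sum(v * v for a_level in y for b_level in a_level
              for cell in b_level for v in cell)
    cf = grand ** 2 / (a * b * c * n)            # correction factor
    s_a = sum(t * t for t in a_tot) / (b * c * n)
    s_b = sum(t * t for row in b_tot for t in row) / (c * n)
    s_c = sum(t * t for a_level in cell_tot for row in a_level for t in row) / n

    return {
        "SS_A": s_a - cf,
        "SS_B(A)": s_b - s_a,
        "SS_C(B)": s_c - s_b,
        "SS_E": raw - s_c,
        "SS_T": raw - cf,
    }


# Made-up data with a = b = c = n = 2, for illustration only.
y = [
    [[[1, 3], [2, 4]], [[5, 7], [6, 8]]],
    [[[2, 2], [4, 4]], [[1, 5], [3, 3]]],
]
ss = nested_sums_of_squares(y)
for name, value in ss.items():
    print(f"{name:8s} {value:8.3f}")
```

The four component sums of squares add up to the total, which gives a quick arithmetic check on any hand computation.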


The expected mean squares can be obtained by proceeding directly as earlier
or using the general rules for obtaining the expected mean squares. The resultant
analysis of variance is summarized in Table 7.1. There are various types of mixed
models that may arise. The analysis of variance table contains the expectations
of mean squares for the case when A and B are fixed and C is random. If we
assume a mixed model with factor A having a systematic effect and factors B
and C having random effects, then the analysis is the same as under Model II except
that $\sigma_\alpha^2$ is now replaced by $\sum_{i=1}^{a} \alpha_i^2/(a - 1)$. The remaining four cases that fall
under the mixed model (i.e., those in which factors B and C have opposite
effects) are left as an exercise. It should, however, be pointed out that the term
involving $\beta_{j(i)}$ disappears from the expectation of $MS_A$ when factor B has a
systematic effect, and the term involving $\gamma_{k(ij)}$ disappears from the expectations
of $MS_A$ and $MS_{B(A)}$ when factor C has a systematic effect. Thus, special care
is needed when determining the appropriate tests to be made.

TABLE 7.1
Analysis of Variance for Model (7.1.1)

Source of       Degrees of      Sum of        Mean
Variation       Freedom         Squares       Square
Due to A        a − 1           SS_A          MS_A
B within A      a(b − 1)        SS_B(A)       MS_B(A)
C within B      ab(c − 1)       SS_C(B)       MS_C(B)
Error           abc(n − 1)      SS_E          MS_E
Total           abcn − 1        SS_T

Expected Mean Square

Model I (A, B, and C Fixed)
MS_A       $\sigma_e^2 + \frac{bcn}{a-1}\sum_{i=1}^{a}\alpha_i^2$
MS_B(A)    $\sigma_e^2 + \frac{cn}{a(b-1)}\sum_{i=1}^{a}\sum_{j=1}^{b}\beta_{j(i)}^2$
MS_C(B)    $\sigma_e^2 + \frac{n}{ab(c-1)}\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\gamma_{k(ij)}^2$
MS_E       $\sigma_e^2$

Model II (A, B, and C Random)
MS_A       $\sigma_e^2 + n\sigma_\gamma^2 + cn\sigma_\beta^2 + bcn\sigma_\alpha^2$
MS_B(A)    $\sigma_e^2 + n\sigma_\gamma^2 + cn\sigma_\beta^2$
MS_C(B)    $\sigma_e^2 + n\sigma_\gamma^2$
MS_E       $\sigma_e^2$

Model III (A and B Fixed, C Random)
MS_A       $\sigma_e^2 + n\sigma_\gamma^2 + \frac{bcn}{a-1}\sum_{i=1}^{a}\alpha_i^2$
MS_B(A)    $\sigma_e^2 + n\sigma_\gamma^2 + \frac{cn}{a(b-1)}\sum_{i=1}^{a}\sum_{j=1}^{b}\beta_{j(i)}^2$
MS_C(B)    $\sigma_e^2 + n\sigma_\gamma^2$
MS_E       $\sigma_e^2$


7.3. TESTS OF HYPOTHESES AND ESTIMATION 


Under the assumption of normality, the four mean squares are independently
distributed as multiples of a chi-square variable such that the ratio of any two
mean squares is distributed as the variance ratio F.¹ The expected mean square
column of Table 7.1 suggests the proper test statistics to be employed for
testing the particular hypotheses of interest. Thus, under Model I, F tests for
the effects of all three factors can be performed by dividing the corresponding
mean squares by $MS_E$. Under Model II, tests for the existence of the main
effects of factor A and the two nested factors B and C all exist. The factor
A effect is tested by means of the ratio $MS_A/MS_{B(A)}$, factor B by the ratio
$MS_{B(A)}/MS_{C(B)}$, and factor C by the ratio $MS_{C(B)}/MS_E$.² Under Model III,
with factor A having a systematic effect and factors B and C having random
effects, the tests for all the factor effects would be the same as indicated under
Model II. The tests for other variations of mixed models are obtained similarly.

The variance components for various model factors are readily estimated
by using the customary analysis of variance procedure. For example, under
Model II, the desired estimators are:

\[
\begin{aligned}
\hat{\sigma}_e^2 &= MS_E, \\
\hat{\sigma}_\gamma^2 &= (MS_{C(B)} - MS_E)/n, \\
\hat{\sigma}_\beta^2 &= (MS_{B(A)} - MS_{C(B)})/cn, \\
\hat{\sigma}_\alpha^2 &= (MS_A - MS_{B(A)})/bcn.
\end{aligned}
\qquad (7.3.1)
\]


The estimators (7.3.1) are the so-called best unbiased estimators as discussed
before; but the estimates for $\sigma_\alpha^2$, $\sigma_\beta^2$, and $\sigma_\gamma^2$ can be negative. It should be


1 For a proof of this result see Scheffé (1959, pp. 251-254).
2 These are all exact F tests and their power is readily expressed in terms of the (central) F
distribution.




noticed that the estimation of variance components is especially simple for
hierarchically nested designs. One simply obtains the difference between the
mean squares for the factor involving the variance component of interest and
the one following it; and the resulting difference is divided by the coefficient
of the variance component in the expected mean square. An exact confidence
interval for $\sigma_e^2$ can be constructed by noting that $abc(n-1)MS_E/\sigma_e^2$ has a
chi-square distribution with $abc(n-1)$ degrees of freedom.³
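The difference rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `nested_variance_components` and its argument names are hypothetical, and the mean squares passed in are those reported in the worked example of Section 7.7 (where b = c = n = 2).

```python
def nested_variance_components(ms_a, ms_b, ms_c, ms_e, b, c, n):
    """ANOVA estimates (7.3.1) for the balanced three-way nested random model.

    Each estimate is the difference between a mean square and the one
    following it, divided by the coefficient of the variance component.
    """
    return {
        "error": ms_e,
        "gamma": (ms_c - ms_e) / n,            # C within B
        "beta": (ms_b - ms_c) / (c * n),       # B within A
        "alpha": (ms_a - ms_b) / (b * c * n),  # factor A
    }


# Mean squares taken from the worked example of Section 7.7.
estimates = nested_variance_components(13.250, 6.042, 3.792, 0.833, b=2, c=2, n=2)
for name, value in estimates.items():
    print(f"{name:>5}: {value:.3f}")
```

Nothing in the function prevents a negative result; as noted above, a negative difference of mean squares simply yields a negative estimate.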

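Under Model III, with A fixed and B and C random, exact confidence intervals
on means, $\mu + \alpha_i$'s, and a linear combination of means, $\sum_{i=1}^{a}\ell_i(\mu+\alpha_i)$,
can be obtained as in Section 6.6. Thus, exact $100(1-\alpha)$ percent confidence
intervals for $\mu + \alpha_i$ and $\sum_{i=1}^{a}\ell_i(\mu+\alpha_i)$ are given by

$$\bar{y}_{i...} \pm t[a(b-1),\, 1-\alpha/2]\sqrt{\frac{MS_{B(A)}}{bcn}}$$

and

$$\sum_{i=1}^{a}\ell_i\bar{y}_{i...} \pm t[a(b-1),\, 1-\alpha/2]\sqrt{\frac{MS_{B(A)}}{bcn}\sum_{i=1}^{a}\ell_i^2},$$

respectively.⁴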
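As a rough numerical illustration of the first interval, the sketch below evaluates it for the control treatment of the glycogen example in Section 7.9 (a = 3, b = 2, c = 3, n = 2, treatment total 1,686, $MS_{B(A)}$ = 265.889). The function name is hypothetical, and the Student t quantile for a(b - 1) = 3 degrees of freedom is entered by hand from standard tables (3.182) rather than computed.

```python
import math


def treatment_mean_interval(mean, ms_b_within_a, b, c, n, t_quantile):
    """Interval  mean +/- t * sqrt(MS_B(A) / (b c n))  for mu + alpha_i, A fixed."""
    half_width = t_quantile * math.sqrt(ms_b_within_a / (b * c * n))
    return mean - half_width, mean + half_width


# t[3, 0.975] from standard tables; an assumption of this sketch.
T_3_975 = 3.182

control_mean = 1686 / 12  # treatment total divided by b*c*n observations
low, high = treatment_mean_interval(
    control_mean, 265.889, b=2, c=3, n=2, t_quantile=T_3_975
)
print(f"mean = {control_mean:.2f}, 95% interval = ({low:.2f}, {high:.2f})")
```

The standard error uses $MS_{B(A)}$ rather than the error mean square because, with B and C random, the rats-within-treatments variation is the appropriate yardstick for treatment means.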


7.4 UNEQUAL NUMBERS IN THE SUBCLASSES 


Consider three factors A, B, and C where B is nested within A and C is nested
within B. Suppose each A level has $b_i$ B levels, each B level has $c_{ij}$ C levels,
and $n_{ijk}$ samples are taken from each C level. Here, the model remains the
same as (7.1.1), where $i = 1, 2, \ldots, a$; $j = 1, 2, \ldots, b_i$; $k = 1, 2, \ldots, c_{ij}$;
and $\ell = 1, 2, \ldots, n_{ijk}$. The total number of observations is

$$N = \sum_{i=1}^{a}\sum_{j=1}^{b_i}\sum_{k=1}^{c_{ij}} n_{ijk}.$$

3 For a discussion of methods for constructing confidence intervals for the individual variance
components $\sigma_\gamma^2$, $\sigma_\beta^2$, and $\sigma_\alpha^2$, the total variance $\sigma_e^2 + \sigma_\gamma^2 + \sigma_\beta^2 + \sigma_\alpha^2$, ratios of variance
components, and the proportions of variability attributable to the individual components,
see Burdick and Graybill (1992, pp. 92-96).

4 Formulae for selecting b, c, and n to minimize the cost of obtaining a sample have been derived
by Marcuse (1949) and Vidmar and Brunden (1980).




Now, the sums of squares in the analysis of variance are computed as follows: 


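$$SS_A = \sum_{i=1}^{a} n_{i..}(\bar{y}_{i...} - \bar{y}_{....})^2 = \sum_{i=1}^{a}\frac{y_{i...}^2}{n_{i..}} - \frac{y_{....}^2}{N},$$

$$SS_{B(A)} = \sum_{i=1}^{a}\sum_{j=1}^{b_i} n_{ij.}(\bar{y}_{ij..} - \bar{y}_{i...})^2 = \sum_{i=1}^{a}\sum_{j=1}^{b_i}\frac{y_{ij..}^2}{n_{ij.}} - \sum_{i=1}^{a}\frac{y_{i...}^2}{n_{i..}},$$

$$SS_{C(B)} = \sum_{i=1}^{a}\sum_{j=1}^{b_i}\sum_{k=1}^{c_{ij}} n_{ijk}(\bar{y}_{ijk.} - \bar{y}_{ij..})^2 = \sum_{i=1}^{a}\sum_{j=1}^{b_i}\sum_{k=1}^{c_{ij}}\frac{y_{ijk.}^2}{n_{ijk}} - \sum_{i=1}^{a}\sum_{j=1}^{b_i}\frac{y_{ij..}^2}{n_{ij.}},$$

and

$$SS_E = \sum_{i=1}^{a}\sum_{j=1}^{b_i}\sum_{k=1}^{c_{ij}}\sum_{\ell=1}^{n_{ijk}} (y_{ijk\ell} - \bar{y}_{ijk.})^2 = \sum_{i=1}^{a}\sum_{j=1}^{b_i}\sum_{k=1}^{c_{ij}}\sum_{\ell=1}^{n_{ijk}} y_{ijk\ell}^2 - \sum_{i=1}^{a}\sum_{j=1}^{b_i}\sum_{k=1}^{c_{ij}}\frac{y_{ijk.}^2}{n_{ijk}},$$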

where the customary notations for totals and means are employed. The resultant 
analysis of variance is summarized in Table 7.2. The derivations for expected 
mean squares can be found in Ganguli (1941) and Scheffé (1959, pp. 255-258).
The coefficients of variance components under the expected mean square
columns are determined as follows: 


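$$n_1 = \frac{N - k_6}{c - b}, \qquad n_2 = \frac{k_6 - k_5}{b - a}, \qquad n_3 = \frac{N - k_4}{b - a},$$

$$n_4 = \frac{k_5 - k_3}{a - 1}, \qquad n_5 = \frac{k_4 - k_2}{a - 1}, \qquad \text{and} \qquad n_6 = \frac{N - k_1}{a - 1},$$

where

$$b = \sum_{i=1}^{a} b_i, \qquad c = \sum_{i=1}^{a}\sum_{j=1}^{b_i} c_{ij}, \qquad N = \sum_{i=1}^{a}\sum_{j=1}^{b_i}\sum_{k=1}^{c_{ij}} n_{ijk},$$

$$k_1 = \sum_{i=1}^{a} n_{i..}^2/N, \qquad k_2 = \sum_{i=1}^{a}\sum_{j=1}^{b_i} n_{ij.}^2/N, \qquad k_3 = \sum_{i=1}^{a}\sum_{j=1}^{b_i}\sum_{k=1}^{c_{ij}} n_{ijk}^2/N,$$

$$k_4 = \sum_{i=1}^{a}\Bigl(\sum_{j=1}^{b_i} n_{ij.}^2\Bigr)\Big/n_{i..}, \qquad k_5 = \sum_{i=1}^{a}\Bigl(\sum_{j=1}^{b_i}\sum_{k=1}^{c_{ij}} n_{ijk}^2\Bigr)\Big/n_{i..},$$

and

$$k_6 = \sum_{i=1}^{a}\sum_{j=1}^{b_i}\Bigl(\sum_{k=1}^{c_{ij}} n_{ijk}^2\Bigr)\Big/n_{ij.}.$$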
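Because these coefficients depend only on the subclass numbers, they are easy to compute mechanically. The following is a minimal sketch, assuming the counts are stored as a nested list with `counts[i][j][k]` equal to $n_{ijk}$; the function name is hypothetical, and the counts used for illustration are those of the diatom example in Section 7.8.

```python
def unbalanced_coefficients(counts):
    """Coefficients n1..n6 of the expected mean squares; counts[i][j][k] = n_ijk."""
    a = len(counts)
    b = sum(len(groups) for groups in counts)                   # total number of B levels
    c = sum(len(cells) for groups in counts for cells in groups)  # total number of C levels
    n_ij = [[sum(cells) for cells in groups] for groups in counts]
    n_i = [sum(row) for row in n_ij]
    N = sum(n_i)
    # Sum of squared n_ijk within each (i, j) subclass.
    sq_ij = [[sum(n * n for n in cells) for cells in groups] for groups in counts]

    k1 = sum(x * x for x in n_i) / N
    k2 = sum(x * x for row in n_ij for x in row) / N
    k3 = sum(s for row in sq_ij for s in row) / N
    k4 = sum(sum(x * x for x in row) / ni for row, ni in zip(n_ij, n_i))
    k5 = sum(sum(row) / ni for row, ni in zip(sq_ij, n_i))
    k6 = sum(s / nij for srow, nrow in zip(sq_ij, n_ij) for s, nij in zip(srow, nrow))

    return {
        "n1": (N - k6) / (c - b),
        "n2": (k6 - k5) / (b - a),
        "n3": (N - k4) / (b - a),
        "n4": (k5 - k3) / (a - 1),
        "n5": (k4 - k2) / (a - 1),
        "n6": (N - k1) / (a - 1),
    }


# Subclass numbers of the diatom example (Section 7.8): two locations,
# three and two bricks, one or two slides per brick.
diatom_counts = [[[6, 8], [4, 6], [4]], [[8, 7], [6, 5]]]
coefficients = unbalanced_coefficients(diatom_counts)
for name, value in coefficients.items():
    print(f"{name} = {value:.4f}")
```

Small differences in the fourth decimal place relative to hand computation can arise because the sketch does not round the intermediate $k$ values.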


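TABLE 7.2
Analysis of Variance for Model (7.1.1) Involving Unequal Numbers in the Subclasses

Source of      Degrees of                                      Sum of        Mean
Variation      Freedom                                         Squares       Square
Due to A       $a - 1$                                         $SS_A$        $MS_A$
B within A     $\sum_{i} b_i - a$                              $SS_{B(A)}$   $MS_{B(A)}$
C within B     $\sum_{i}\sum_{j} c_{ij} - \sum_{i} b_i$        $SS_{C(B)}$   $MS_{C(B)}$
Error          $N - \sum_{i}\sum_{j} c_{ij}$                   $SS_E$        $MS_E$
Total          $N - 1$                                         $SS_T$

Expected Mean Square

Model I* (A, B, and C Fixed)
  Due to A:     $\sigma_e^2 + \frac{1}{a-1}\sum_{i} n_{i..}\alpha_i^2$
  B within A:   $\sigma_e^2 + \frac{1}{b-a}\sum_{i}\sum_{j} n_{ij.}\beta_{j(i)}^2$
  C within B:   $\sigma_e^2 + \frac{1}{c-b}\sum_{i}\sum_{j}\sum_{k} n_{ijk}\gamma_{k(ij)}^2$
  Error:        $\sigma_e^2$

Model II (A, B, and C Random)
  Due to A:     $\sigma_e^2 + n_4\sigma_\gamma^2 + n_5\sigma_\beta^2 + n_6\sigma_\alpha^2$
  B within A:   $\sigma_e^2 + n_2\sigma_\gamma^2 + n_3\sigma_\beta^2$
  C within B:   $\sigma_e^2 + n_1\sigma_\gamma^2$
  Error:        $\sigma_e^2$

Model III** (A Fixed, B and C Random)
  Due to A:     $\sigma_e^2 + n_4\sigma_\gamma^2 + n_5\sigma_\beta^2 + \frac{1}{a-1}\sum_{i} n_{i..}\alpha_i^2$
  B within A:   $\sigma_e^2 + n_2\sigma_\gamma^2 + n_3\sigma_\beta^2$
  C within B:   $\sigma_e^2 + n_1\sigma_\gamma^2$
  Error:        $\sigma_e^2$

* The side constraints under Model I are: $\sum_{i} n_{i..}\alpha_i = 0$, $\sum_{j} n_{ij.}\beta_{j(i)} = 0$, and $\sum_{k} n_{ijk}\gamma_{k(ij)} = 0$.

** The side constraint under Model III is $\sum_{i} n_{i..}\alpha_i = 0$.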




From Table 7.2, it is evident that no simple F tests are possible except for
testing factor C effects. Again, this is so because under the null hypothesis
we do not have two mean squares in the analysis of variance table that estimate
the same quantity. Furthermore, the mean squares other than $MS_E$ are
not distributed as constant times a chi-square random variable, and they are
not statistically independent except that $MS_E$ is independent of the other mean
squares. For the approximate tests of significance, we determine the coefficients
of the variance components, calculate estimates of the variance components, 
and then determine linear combinations of mean squares to use as the de- 
nominators of the F ratios in the Satterthwaite approximation. The variance 
components estimates as usual are obtained by solving the equations obtained 
by equating the mean squares to their respective expected values. These are 
the so-called analysis of variance estimates (Mahamunulu (1963)). The for- 
mulae for these estimators including the expressions for their sampling vari- 
ances are also given in Searle (1971b, pp. 477-479) and Searle et al. (1992, 
pp. 431-433). 

An exact confidence interval for $\sigma_e^2$ can be obtained as in Section 6.10.
Burdick and Graybill (1992, pp. 109-116) indicate a method for constructing
confidence intervals for $\sigma_\alpha^2$, $\sigma_\beta^2$, and $\sigma_\gamma^2$ with a numerical example.


7.5 FOUR-WAY NESTED CLASSIFICATION 


In this section, we briefly review the analysis of variance for the four-way nested
classification which is the obvious extension of the three-way nested analysis
of variance. The model is

$$y_{ijk\ell m} = \mu + \alpha_i + \beta_{j(i)} + \gamma_{k(ij)} + \delta_{\ell(ijk)} + e_{m(ijk\ell)} \quad \begin{cases} i = 1, \ldots, a \\ j = 1, \ldots, b \\ k = 1, \ldots, c \\ \ell = 1, \ldots, d \\ m = 1, \ldots, n, \end{cases} \qquad (7.5.1)$$

where the meaning of each symbol and the assumptions of the model are readily 
stated. Starting with the identity 


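$$y_{ijk\ell m} - \bar{y}_{.....} = (\bar{y}_{i....} - \bar{y}_{.....}) + (\bar{y}_{ij...} - \bar{y}_{i....}) + (\bar{y}_{ijk..} - \bar{y}_{ij...}) + (\bar{y}_{ijk\ell.} - \bar{y}_{ijk..}) + (y_{ijk\ell m} - \bar{y}_{ijk\ell.}),$$

the total sum of squares is partitioned as

$$SS_T = SS_A + SS_{B(A)} + SS_{C(B)} + SS_{D(C)} + SS_E,$$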




where 
a b Cc d n 
SSr =>) YS dD Oiinem — 5...) 
i=1 j=l k=1 €=1 m=1 
2p) Be: ! 
2 2 
a Vijkem —~ poy Moe? 
=e = = abcdn 
a 
SS4 = bedn )(9;... — 5... 
i=1 
Tze 1 
= ye = eee 
bcdn =] abcdn 
a b 
SSpa) = cdn >: YWiz.. — 9...) 
i=) j=] 
1 a b 1 a 
a 2 2 
ae oh. pag 
f=1 J=1 i=] 
a b c 
SScca) = dn Y) > SY Ginn. — 5iy...° 
i=l j=l k=1 
ain Dy 2k can 2 GM 
i=1. j=). k=1 i=) j=) 
a b c d 
SSpic) =n » > Cie. — Vij.) 
i=1 j=l k=1 €=1 
1 a b c d ; l a b c ; 
>) 3h ee 
f=! y=1 k=! f=) i=]. j=1 k=1 
and 
a b Cc d n 
SSE = > > (Vijkem — Vijke.) 
i=1 j=1 k=1 €=1 m= 
a b Cc d n l a b Cc d 
2 2 
= = De Vijktm ~ 7 » > Yinne 
i=1 j=1 k=1 €=1 m=1 i=1 j=l k=1 €=1 


with the usual notations of dots and bars. The corresponding mean squares
denoted by $MS_A$, $MS_{B(A)}$, $MS_{C(B)}$, $MS_{D(C)}$, and $MS_E$ are obtained by dividing
the sums of squares by the respective degrees of freedom. The resultant analysis 
of variance is summarized in Table 7.3. The nested classifications having more 
than four factors have the analysis of variance tables with the same general 
pattern. 


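TABLE 7.3
Analysis of Variance for Model (7.5.1)

Source of      Degrees of       Sum of        Mean
Variation      Freedom          Squares       Square
Due to A       $a - 1$          $SS_A$        $MS_A$
B within A     $a(b - 1)$       $SS_{B(A)}$   $MS_{B(A)}$
C within B     $ab(c - 1)$      $SS_{C(B)}$   $MS_{C(B)}$
D within C     $abc(d - 1)$     $SS_{D(C)}$   $MS_{D(C)}$
Error          $abcd(n - 1)$    $SS_E$        $MS_E$
Total          $abcdn - 1$      $SS_T$

Expected Mean Square

Model I (A, B, C, and D Fixed)
  Due to A:     $\sigma_e^2 + \frac{bcdn}{a-1}\sum_{i}\alpha_i^2$
  B within A:   $\sigma_e^2 + \frac{cdn}{a(b-1)}\sum_{i}\sum_{j}\beta_{j(i)}^2$
  C within B:   $\sigma_e^2 + \frac{dn}{ab(c-1)}\sum_{i}\sum_{j}\sum_{k}\gamma_{k(ij)}^2$
  D within C:   $\sigma_e^2 + \frac{n}{abc(d-1)}\sum_{i}\sum_{j}\sum_{k}\sum_{\ell}\delta_{\ell(ijk)}^2$
  Error:        $\sigma_e^2$

Model II (A, B, C, and D Random)
  Due to A:     $\sigma_e^2 + n\sigma_\delta^2 + dn\sigma_\gamma^2 + cdn\sigma_\beta^2 + bcdn\sigma_\alpha^2$
  B within A:   $\sigma_e^2 + n\sigma_\delta^2 + dn\sigma_\gamma^2 + cdn\sigma_\beta^2$
  C within B:   $\sigma_e^2 + n\sigma_\delta^2 + dn\sigma_\gamma^2$
  D within C:   $\sigma_e^2 + n\sigma_\delta^2$
  Error:        $\sigma_e^2$

Model III (A and B Fixed, C and D Random)
  Due to A:     $\sigma_e^2 + n\sigma_\delta^2 + dn\sigma_\gamma^2 + \frac{bcdn}{a-1}\sum_{i}\alpha_i^2$
  B within A:   $\sigma_e^2 + n\sigma_\delta^2 + dn\sigma_\gamma^2 + \frac{cdn}{a(b-1)}\sum_{i}\sum_{j}\beta_{j(i)}^2$
  C within B:   $\sigma_e^2 + n\sigma_\delta^2 + dn\sigma_\gamma^2$
  D within C:   $\sigma_e^2 + n\sigma_\delta^2$
  Error:        $\sigma_e^2$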




The expected mean square column of Table 7.3 suggests the proper test 
statistics to be employed for testing the particular hypotheses of interest. For 
example, under Model II, note that each expected mean square contains all 
the terms of the expected mean square that follows it in the table. Thus, an 
appropriate F statistic to test the statistical significance of any factor effects 
is determined as the mean square of the factor of interest divided by the mean 
square immediately following it. The unbiased estimators of the variance com- 
ponents are also readily obtained using the customary analysis of variance pro- 
cedure. In particular, the best unbiased estimators of the variance components 
under Model II are obtained simply by using the differences between the mean 
square of the factor of interest and the one immediately following it; that is, 


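$$\hat{\sigma}_e^2 = MS_E,$$

$$\hat{\sigma}_\delta^2 = (MS_{D(C)} - MS_E)/n,$$

$$\hat{\sigma}_\gamma^2 = (MS_{C(B)} - MS_{D(C)})/dn,$$

$$\hat{\sigma}_\beta^2 = (MS_{B(A)} - MS_{C(B)})/cdn,$$

and

$$\hat{\sigma}_\alpha^2 = (MS_A - MS_{B(A)})/bcdn.$$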


As in Section 7.3, an exact confidence interval for $\sigma_e^2$ can be obtained by
noting that $abcd(n-1)MS_E/\sigma_e^2$ has a chi-square distribution with $abcd(n-1)$
degrees of freedom. For a discussion of methods for constructing confidence 
intervals for other variance components, including certain sums and ratios of 
variance components, see Burdick and Graybill (1992, pp. 92-95). 


7.6 GENERAL g-WAY NESTED CLASSIFICATION 


The results of a nested classification can be readily generalized to the case of q
completely nested factors. Such a design is also called a (q + 1)-stage nested
design. The general q-way nested classification model is a direct extension of
the model (7.5.1) and can be written as

$$y_{ijk \ldots pqr} = \mu + \alpha_i + \beta_{j(i)} + \gamma_{k(ij)} + \cdots + \delta_{q(ijk \ldots p)} + e_{r(ijk \ldots pq)} \quad \begin{cases} i = 1, 2, \ldots, a \\ j = 1, 2, \ldots, b \\ k = 1, 2, \ldots, c \\ \quad\vdots \end{cases} \qquad (7.6.1)$$

where $\mu$ is the general mean, $\alpha_i, \beta_{j(i)}, \gamma_{k(ij)}, \ldots, \delta_{q(ijk \ldots p)}$ are the effects due to
the i-th level of factor A, the j-th level of factor B, the k-th level of factor C, ...,
the q-th level of factor Q, and $e_{r(ijk \ldots pq)}$ is the customary error term that represents
the variation within each cell. The assumptions of the model (7.6.1)
are readily stated depending upon whether the levels of factors A, B, C, ..., Q
are fixed or random. For example, under Model II, the $\alpha_i$'s, $\beta_{j(i)}$'s, $\gamma_{k(ij)}$'s,
..., $\delta_{q(ijk \ldots p)}$'s, and $e_{r(ijk \ldots pq)}$'s are independent (normal) random variables with
means of zero and variances $\sigma_\alpha^2, \sigma_\beta^2, \sigma_\gamma^2, \ldots$, and $\sigma_e^2$, respectively. For this
balanced model, all the mean squares are distributed as constant times a chi-square
random variable and they are statistically independent. Under the assumptions 
of the model of interest, the results on tests of hypotheses and estimation of 
variance components can be obtained in the manner described earlier. For a 
discussion of methods for constructing confidence intervals for the variance 
components, see Burdick and Graybill (1992, Section 5.3). For the analysis of 
a general g-way nested classification with unequal numbers in the subclasses, 
we use the same approach as given in Section 7.4. The details on analysis 
of variance, tests of hypotheses, and variance components estimation can be 
found in Gates and Shiue (1962) and Gower (1962). Khuri (1990) presents 
some exact tests for random models when all stages except the last one are 
balanced. 
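Since every nested sum of squares has the same form (subclass size times the squared deviation of a subclass mean from the mean of the class containing it), the computation can be written once for any number of stages. The sketch below is one way to do this, assuming the observations are held in a nested Python list whose depth equals the number of stages; the function names are hypothetical. It is illustrated with the glycogen readings of the worked example in Section 7.9 (treatments, rats, preparations, readings).

```python
def leaves(node):
    """Return every observation stored below a node of the nested list."""
    if isinstance(node, (int, float)):
        return [node]
    return [y for child in node for y in leaves(child)]


def nested_sums_of_squares(data):
    """Sums of squares for a fully nested layout, outermost factor first, error last.

    At each stage a subclass contributes
    (number of observations in it) * (its mean - mean of its parent class)^2.
    """
    sums = []
    level = [data]  # start from the root, which holds the levels of factor A
    while not isinstance(level[0], (int, float)):
        ss = 0.0
        children = []
        for node in level:
            obs = leaves(node)
            node_mean = sum(obs) / len(obs)
            for child in node:
                child_obs = leaves(child)
                child_mean = sum(child_obs) / len(child_obs)
                ss += len(child_obs) * (child_mean - node_mean) ** 2
                children.append(child)
        sums.append(ss)
        level = children
    return sums


# Glycogen readings: treatment -> rat -> preparation -> two readings.
glycogen = [
    [[[131, 130], [131, 125], [136, 142]], [[150, 148], [140, 143], [160, 150]]],
    [[[157, 145], [154, 142], [147, 153]], [[151, 155], [147, 147], [162, 152]]],
    [[[134, 125], [138, 138], [135, 136]], [[138, 140], [139, 138], [134, 127]]],
]
ss_a, ss_b, ss_c, ss_e = nested_sums_of_squares(glycogen)
print(f"{ss_a:.3f} {ss_b:.3f} {ss_c:.3f} {ss_e:.3f}")
```

Because each term is weighted by the number of observations in the subclass, the same function also reproduces the sums of squares of Section 7.4 when the subclass numbers are unequal.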


7.7 WORKED EXAMPLE FOR MODEL II 


Brownlee (1953, p. 117) reported data from an experiment carried out to deter- 
mine whether a batch of material was homogeneous. The material was sampled 
in six different vats. The matter from each vat was wrung in a centrifuge and 
bagged. Two bags were randomly selected from each vat and two samples were 
taken from each bag. Finally, for each sample, two independent determina- 
tions were made for the percentage of an ingredient. The data are given in 
Table 7.4. 

The experimental structure follows a three-way nested or hierarchical classi- 
fication and the mathematical model is 


$$y_{ijk\ell} = \mu + \alpha_i + \beta_{j(i)} + \gamma_{k(ij)} + e_{\ell(ijk)} \quad \begin{cases} i = 1, 2, \ldots, 6 \\ j = 1, 2 \\ k = 1, 2 \\ \ell = 1, 2, \end{cases}$$

where $y_{ijk\ell}$ is the $\ell$-th determination (analysis) of the k-th sample, of the j-th
bag and for the i-th vat, $\mu$ is the general mean, $\alpha_i$ is the effect of the i-th vat,
$\beta_{j(i)}$ is the effect of the j-th bag within the i-th vat, $\gamma_{k(ij)}$ is the effect of the k-th
sample within the j-th bag within the i-th vat, and $e_{\ell(ijk)}$ is the customary error
term. Furthermore, in this example, it is reasonable to assume that all factors
are random and thus the $\alpha_i$'s, $\beta_{j(i)}$'s, $\gamma_{k(ij)}$'s, and $e_{\ell(ijk)}$'s are all independently
and normally distributed with mean zero and variances $\sigma_\alpha^2$, $\sigma_\beta^2$, $\sigma_\gamma^2$, and $\sigma_e^2$,
respectively.


TABLE 7.4
Percentage of Ingredient of a Batch of Material

[The table is printed rotated in the original and the two individual determinations for each sample are not legible in this copy. The sample, bag, and vat totals used in the computations that follow are listed below.]

Vat    Bag    $y_{ijk.}$ (Sample 1, Sample 2)    $y_{ij..}$    $y_{i...}$
1      1      58, 55                             113           226
       2      58, 55                             113
2      1      58, 55                             113           215
       2      53, 49                             102
3      1      62, 59                             121           234
       2      52, 61                             113
4      1      58, 61                             119           233
       2      56, 58                             114
5      1      59, 54                             113           218
       2      53, 52                             105
6      1      60, 63                             123           242
       2      59, 60                             119

Source: Brownlee (1953, p. 117). Used with permission.



Using the computational formulae for the sums of squares given in Section 7.2, 
we have 


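$$SS_T = (29)^2 + (29)^2 + \cdots + (31)^2 - \frac{(1{,}368)^2}{6 \times 2 \times 2 \times 2} = 39{,}156 - 38{,}988 = 168.000,$$

$$SS_A = \frac{(226)^2 + (215)^2 + \cdots + (242)^2}{2 \times 2 \times 2} - \frac{(1{,}368)^2}{6 \times 2 \times 2 \times 2} = 39{,}054.250 - 38{,}988 = 66.250,$$

$$SS_{B(A)} = \frac{(113)^2 + (113)^2 + \cdots + (119)^2}{2 \times 2} - \frac{(226)^2 + (215)^2 + \cdots + (242)^2}{2 \times 2 \times 2} = 39{,}090.500 - 39{,}054.250 = 36.250,$$

$$SS_{C(B)} = \frac{(58)^2 + (55)^2 + \cdots + (60)^2}{2} - \frac{(113)^2 + (113)^2 + \cdots + (119)^2}{2 \times 2} = 39{,}136 - 39{,}090.500 = 45.500,$$

and

$$SS_E = (29)^2 + (29)^2 + \cdots + (31)^2 - \frac{(58)^2 + (55)^2 + \cdots + (60)^2}{2} = 39{,}156 - 39{,}136 = 20.000.$$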

These results along with the remaining calculations are summarized in
Table 7.5. The test of the hypothesis $H_0^{C(B)}: \sigma_\gamma^2 = 0$ versus $H_1^{C(B)}: \sigma_\gamma^2 > 0$
gives the variance ratio of 4.55 which is highly significant (p < 0.001). The test
of the hypothesis $H_0^{B(A)}: \sigma_\beta^2 = 0$ versus $H_1^{B(A)}: \sigma_\beta^2 > 0$ gives the variance ratio
of 1.59 which is not significant (p = 0.232). Finally, the test of the hypothesis
$H_0^{A}: \sigma_\alpha^2 = 0$ versus $H_1^{A}: \sigma_\alpha^2 > 0$ gives the variance ratio of 2.19 which is again
not significant (p = 0.183). However, note that the F test for vats has so few
degrees of freedom that it may not be able to detect significant differences even
if there really are important differences among them. Thus, we may conclude
that although there seems to be significant variability among samples within
bags and some variability between vats, there is no indication of any differences
among bags within vats. The estimates of the variance components are




TABLE 7.5
Analysis of Variance for the Material Homogeneity Data of Table 7.4

Source of        Degrees of   Sum of    Mean     Expected
Variation        Freedom      Squares   Square   Mean Square                                                                                       F Value   p-Value

Vats             5            66.250    13.250   $\sigma_e^2 + 2\sigma_\gamma^2 + 2 \times 2\sigma_\beta^2 + 2 \times 2 \times 2\sigma_\alpha^2$   2.19      0.183

Bags             6            36.250    6.042    $\sigma_e^2 + 2\sigma_\gamma^2 + 2 \times 2\sigma_\beta^2$                                        1.59      0.232
(within vats)

Samples          12           45.500    3.792    $\sigma_e^2 + 2\sigma_\gamma^2$                                                                   4.55      <0.001
(within bags)

Error            24           20.000    0.833    $\sigma_e^2$

Total            47           168.000

given by 


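$$\hat{\sigma}_e^2 = 0.833,$$

$$\hat{\sigma}_\gamma^2 = \frac{1}{2}(3.792 - 0.833) = 1.480,$$

$$\hat{\sigma}_\beta^2 = \frac{1}{4}(6.042 - 3.792) = 0.563,$$

and

$$\hat{\sigma}_\alpha^2 = \frac{1}{8}(13.250 - 6.042) = 0.901.$$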


These variance components account for 22.0, 39.2, 14.9, and 23.9 percent of 
the total variation in material content in this experiment. It is evident from this 
analysis that the batch of material under investigation is highly inhomogeneous 
and the larger part of this variability arises in bagging the material. The vari- 
ability between repeated analyses on a given sample is also quite large, and 
there also seem to be appreciable differences between contents of each vat. 
We further refine the preceding analysis by resorting to the method of pooling. 
In our earlier analysis we have seen that the variance component due to bags 
within vats ($\sigma_\beta^2$) is not statistically significant. We can thus pool its mean square
with the samples within bags mean square to get a new estimate of $\sigma_e^2 + 2\sigma_\gamma^2$
equal to (45.500 + 36.250)/18 = 4.542 with 18 degrees of freedom. Now, the
hypothesis on the variance component due to vats ($\sigma_\alpha^2$) is tested by the between
vats mean square against the pooled value of the between samples within bags 
mean square. The variance ratio for the test is 13.250/4.542 = 2.92 with 5 
and 18 degrees of freedom, respectively. Note that in contrast to the unpooled 
analysis, this value is significant at the 5 percent level (p = 0.041). Finally, the 




pooled estimates of variance components are now given by 


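$$\hat{\sigma}_e^2 = 0.833,$$

$$\hat{\sigma}_\gamma^2 = \frac{1}{2}(4.542 - 0.833) = 1.855,$$

$$\hat{\sigma}_\beta^2 = 0,$$

and

$$\hat{\sigma}_\alpha^2 = \frac{1}{8}(13.250 - 4.542) = 1.089.$$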


These variance components account for 22.1, 49.1, 0, and 28.8 percent of 
the total variation in the material. The results of pooled analysis are similar 
to the earlier analysis. Thus, it is seen that there is an appreciable variability 
between vats. The variability between bags from a given vat is not large enough 
to be statistically significant. The variability between samples from a given 
bag is extremely large and, in fact, may account for nearly half of the total 
variation. The variability between duplicate analyses of a given sample is also 
quite large, probably the second most important component of variability in the 
process. 
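The pooling arithmetic above is simple enough to retrace in a short script. The sketch below uses the sums of squares and mean squares of Table 7.5; the variable names are illustrative only, and the p-value is not recomputed because that would need an F distribution routine.

```python
# Entries of Table 7.5 for the material homogeneity data.
ss_bags, df_bags = 36.250, 6
ss_samples, df_samples = 45.500, 12
ms_vats, ms_error = 13.250, 0.833

# Pool bags-within-vats with samples-within-bags.
pooled_df = df_bags + df_samples
pooled_ms = (ss_bags + ss_samples) / pooled_df

# Vats are now tested against the pooled mean square.
f_vats = ms_vats / pooled_ms

# Pooled variance component estimates (b = c = n = 2, bags component set to 0).
components = {
    "error": ms_error,
    "samples": (pooled_ms - ms_error) / 2,
    "bags": 0.0,
    "vats": (ms_vats - pooled_ms) / 8,
}
total = sum(components.values())
shares = {name: 100 * value / total for name, value in components.items()}

print(f"pooled MS = {pooled_ms:.3f} on {pooled_df} df, F = {f_vats:.2f}")
for name in components:
    print(f"{name:>8}: {components[name]:.3f} ({shares[name]:.1f}%)")
```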


7.8 WORKED EXAMPLE FOR MODEL II: UNEQUAL NUMBERS 
IN THE SUBCLASSES 


Damon and Harvey (1987, p. 29) reported data from an experiment to determine 
the number of diatoms at different locations on a river. (The original data were 
supplied by Dr. Richard Larsen of the Department of Fisheries and Wild Life of 
the University of Massachusetts.) The experiment entailed determination of the 
number of diatoms at two randomly selected locations, two or three bricks at 
each location, and one or two slides attached to each brick. Thus, there are two 
hierarchies of nesting, bricks nested within locations and slides nested within 
bricks. The number of diatoms per square centimeter colonizing each glass slide
was determined and the data are given in Table 7.6.

The design structure follows a three-way nested or hierarchical classification 
and the mathematical model is

$$y_{ijk\ell} = \mu + \alpha_i + \beta_{j(i)} + \gamma_{k(ij)} + e_{\ell(ijk)} \quad \begin{cases} i = 1, 2 \\ j = 1, 2, \ldots, b_i \\ k = 1, 2, \ldots, c_{ij} \\ \ell = 1, 2, \ldots, n_{ijk}, \end{cases} \qquad (7.8.1)$$

where $y_{ijk\ell}$ is the $\ell$-th observation on the k-th slide, on the j-th brick and at the
i-th location. Here, the number of bricks in the i-th location is designated as
$b_i$ ($b_1 = 3$, $b_2 = 2$), the number of slides in the j(i)-th brick (location) subclass



TABLE 7.6
Number of Diatoms per Square Centimeter Colonizing Glass Slides*

                          Location 1                                   Location 2
              Brick 1           Brick 2       Brick 3        Brick 1           Brick 2
              Slide 1  Slide 2  Slide 1  Slide 2  Slide 1    Slide 1  Slide 2  Slide 1  Slide 2

                102      500      142      119      243        500      822      826      642
                111      480      125      114      189        165      743      750      710
                400      112      221      461      362        752      263      682      720
                380      103      464      382      264        142      321      522      584
                210      225               510                 921      620      650      650
                245      361               380                 792      584      621
                         842                                   871      841
                         657                                   900

$y_{ijk.}$    1,448    3,280      952    1,966    1,058      5,043    4,194    4,051    3,306
$n_{ijk}$         6        8        4        6        4          8        7        6        5
$y_{ij..}$         4,728             2,918        1,058           9,237             7,357
$n_{ij.}$             14                10            4              15                11
$y_{i...}$                        8,704                                  16,594
$n_{i..}$                            28                                      26

Source: Damon and Harvey (1987, p. 29). Used with permission.

* Numbers have been coded to simplify computation.


is designated as $c_{ij}$ ($c_{11} = 2$, $c_{12} = 2$, $c_{13} = 1$, $c_{21} = 2$, $c_{22} = 2$), and the number
of observations in the k(ij)-th slide (brick (location)) sub-subclass is designated
as $n_{ijk}$ ($n_{111} = 6$, $n_{112} = 8$, $n_{121} = 4$, $n_{122} = 6$, $n_{131} = 4$, $n_{211} = 8$, $n_{212} = 7$,
$n_{221} = 6$, $n_{222} = 5$). Furthermore, in the model equation (7.8.1), $\mu$ is the general
mean, $\alpha_i$ is the effect of the i-th location, $\beta_{j(i)}$ is the effect of the j-th brick
in the i-th location, $\gamma_{k(ij)}$ is the effect of the k-th slide on the j-th brick in the
i-th location, and $e_{\ell(ijk)}$ is the customary error term. Finally, we will assume
that all factors are random; that is, the $\alpha_i$'s, $\beta_{j(i)}$'s, $\gamma_{k(ij)}$'s, and $e_{\ell(ijk)}$'s are
independently and normally distributed with mean zero and variances $\sigma_\alpha^2$, $\sigma_\beta^2$,
$\sigma_\gamma^2$, and $\sigma_e^2$, respectively.


All the quantities needed for the analysis of variance computations outlined 
in Section 7.4 can be readily computed on an electronic calculator. The results 


are: 


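$$y_{....}^2/N = (25{,}298)^2/54 = 11{,}851{,}644.52,$$

$$\sum_{i=1}^{a}\sum_{j=1}^{b_i}\sum_{k=1}^{c_{ij}}\sum_{\ell=1}^{n_{ijk}} y_{ijk\ell}^2 = 15{,}370{,}364,$$

$$\sum_{i=1}^{a}\sum_{j=1}^{b_i}\sum_{k=1}^{c_{ij}}\frac{y_{ijk.}^2}{n_{ijk}} = \frac{(1{,}448)^2}{6} + \frac{(3{,}280)^2}{8} + \cdots + \frac{(3{,}306)^2}{5} = 13{,}457{,}673.97,$$

$$\sum_{i=1}^{a}\sum_{j=1}^{b_i}\frac{y_{ij..}^2}{n_{ij.}} = \frac{(4{,}728)^2}{14} + \frac{(2{,}918)^2}{10} + \cdots + \frac{(7{,}357)^2}{11} = 13{,}336{,}666.51,$$

and

$$\sum_{i=1}^{a}\frac{y_{i...}^2}{n_{i..}} = \frac{(8{,}704)^2}{28} + \frac{(16{,}594)^2}{26} = 13{,}296{,}501.96.$$

Thus,

$$SS_T = 15{,}370{,}364 - 11{,}851{,}644.52 = 3{,}518{,}719.48,$$
$$SS_A = 13{,}296{,}501.96 - 11{,}851{,}644.52 = 1{,}444{,}857.44,$$
$$SS_{B(A)} = 13{,}336{,}666.51 - 13{,}296{,}501.96 = 40{,}164.55,$$
$$SS_{C(B)} = 13{,}457{,}673.97 - 13{,}336{,}666.51 = 121{,}007.46,$$

and

$$SS_E = 15{,}370{,}364 - 13{,}457{,}673.97 = 1{,}912{,}690.03.$$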
The corresponding degrees of freedom are computed as: 


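Total: $N - 1 = 54 - 1 = 53$,

Locations: $a - 1 = 2 - 1 = 1$,

Bricks (within locations): $\sum_{i=1}^{a} b_i - a = 5 - 2 = 3$,

Slides (within bricks): $\sum_{i=1}^{a}\sum_{j=1}^{b_i} c_{ij} - \sum_{i=1}^{a} b_i = 9 - 5 = 4$,

Error: $N - \sum_{i=1}^{a}\sum_{j=1}^{b_i} c_{ij} = 54 - 9 = 45$.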

For appropriate tests of significance, we must evaluate the coefficients of the 
variance components in the expected mean squares, and determine the linear 
combinations of mean squares to be used as the denominator of an approxi- 
mate F statistic using Satterthwaite procedure. The basic quantities needed to 




determine the coefficients of the variance components are computed as follows: 


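$$N = \sum_{i=1}^{a}\sum_{j=1}^{b_i}\sum_{k=1}^{c_{ij}} n_{ijk} = 6 + 8 + \cdots + 5 = 54,$$

$$k_1 = \frac{\sum_{i} n_{i..}^2}{N} = \frac{(28)^2 + (26)^2}{54} = 27.0370,$$

$$k_2 = \frac{\sum_{i}\sum_{j} n_{ij.}^2}{N} = \frac{(14)^2 + (10)^2 + \cdots + (11)^2}{54} = 12.1852,$$

$$k_3 = \frac{\sum_{i}\sum_{j}\sum_{k} n_{ijk}^2}{N} = \frac{(6)^2 + (8)^2 + \cdots + (5)^2}{54} = 6.3333,$$

$$k_4 = \sum_{i}\frac{\sum_{j} n_{ij.}^2}{n_{i..}} = \frac{(14)^2 + (10)^2 + (4)^2}{28} + \frac{(15)^2 + (11)^2}{26} = 24.4506,$$

$$k_5 = \sum_{i}\frac{\sum_{j}\sum_{k} n_{ijk}^2}{n_{i..}} = \frac{(6)^2 + (8)^2 + \cdots + (4)^2}{28} + \frac{(8)^2 + (7)^2 + \cdots + (5)^2}{26} = 12.6923,$$

and

$$k_6 = \sum_{i}\sum_{j}\frac{\sum_{k} n_{ijk}^2}{n_{ij.}} = \frac{(6)^2 + (8)^2}{14} + \frac{(4)^2 + (6)^2}{10} + \cdots + \frac{(6)^2 + (5)^2}{11} = 29.4217.$$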


Now, the coefficients of the variance components in the expected mean square 
column are given by 


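$$n_1 = \frac{N - k_6}{c - b} = \frac{54 - 29.4217}{4} = 6.1446,$$

$$n_2 = \frac{k_6 - k_5}{b - a} = \frac{29.4217 - 12.6923}{3} = 5.5765,$$

$$n_3 = \frac{N - k_4}{b - a} = \frac{54 - 24.4506}{3} = 9.8498,$$

$$n_4 = \frac{k_5 - k_3}{a - 1} = \frac{12.6923 - 6.3333}{1} = 6.3590,$$

$$n_5 = \frac{k_4 - k_2}{a - 1} = \frac{24.4506 - 12.1852}{1} = 12.2654,$$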




TABLE 7.7
Analysis of Variance for the Diatom Data of Table 7.6

Source of           Degrees of   Sum of          Mean             Expected
Variation           Freedom      Squares         Square           Mean Square
Locations           1            1,444,857.44    1,444,857.440    $\sigma_e^2 + 6.3590\sigma_\gamma^2 + 12.2654\sigma_\beta^2 + 26.9630\sigma_\alpha^2$
Bricks (within      3            40,164.55       13,388.183       $\sigma_e^2 + 5.5765\sigma_\gamma^2 + 9.8498\sigma_\beta^2$
locations)
Slides (within      4            121,007.46      30,251.865       $\sigma_e^2 + 6.1446\sigma_\gamma^2$
bricks)
Error               45           1,912,690.03    42,504.223       $\sigma_e^2$
Total               53           3,518,719.48
and

$$n_6 = \frac{N - k_1}{a - 1} = \frac{54 - 27.0370}{1} = 26.9630.$$


The results on sums of squares, mean squares, and expected mean squares are 
summarized in Table 7.7. 

The slides within bricks effects can be tested directly against the error mean 
square, giving F = 30,251.865/42,504.223 = 0.712 (p =0.588). The results 
are clearly nonsignificant. An approximate F test for bricks within locations can 
be obtained using the slides within bricks mean square, giving F = 13,388.183/ 
30,251.865 = 0.443 (p = 0.735). However, to use the Satterthwaite procedure, 
we first compute the coefficient as 


$$\ell_2 = n_2/n_1 = 5.5765/6.1446 = 0.9075, \qquad \ell_1 = 1 - \ell_2 = 0.0925;$$

and the synthesized mean square is

$$0.0925(42{,}504.223) + 0.9075(30{,}251.865) = 31{,}385.208.$$

The number of degrees of freedom for the synthesized mean square (rounded
to the nearest digit) is

$$\nu = \frac{(31{,}385.208)^2}{\dfrac{[0.0925(42{,}504.223)]^2}{45} + \dfrac{[0.9075(30{,}251.865)]^2}{4}} \doteq 5.$$

The F ratio based on the synthesized mean square is F' = 13,388.183/31,385.208 
= 0.427 (p = 0.743), which gives nearly the same result as before; that is, 
bricks within location effects are also not significant. Similar procedures are 


used to test for location effects. For example, an approximate F test for location 
effects can be obtained using the bricks within location mean square, giving 




F = 1,444,857.440/13,388.183 = 107.920 (p = 0.002). To use the Satterthwaite
procedure, the coefficients are

$$\ell_5 = n_5/n_3 = 12.2654/9.8498 = 1.2452,$$
$$\ell_4 = n_4/n_1 - \ell_5 n_2/n_1 = (6.3590/6.1446) - 1.2452(5.5765/6.1446) = -0.0952,$$
$$\ell_3 = 1 - \ell_4 - \ell_5 = 1 - (-0.0952) - 1.2452 = -0.1500$$

and the synthesized mean square is

$$-0.1500(42{,}504.223) + (-0.0952)(30{,}251.865) + 1.2452(13{,}388.183) = 7{,}415.354.$$

The number of degrees of freedom for the synthesized mean square (rounded
to the nearest digit) is

$$\nu = \frac{(7{,}415.354)^2}{\dfrac{[-0.1500(42{,}504.223)]^2}{45} + \dfrac{[-0.0952(30{,}251.865)]^2}{4} + \dfrac{[1.2452(13{,}388.183)]^2}{3}} \doteq 2.$$

The F ratio based on the synthesized mean square is 1,444,857.440/7,415.354 =
194.847. Again, the results are highly significant (p < 0.001).
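The synthesized mean square and its approximate degrees of freedom follow a single recipe, which the sketch below spells out. It is only an illustration: the function name `satterthwaite` is hypothetical, and the call shown uses the coefficients and mean squares of the bricks-within-locations test from this section.

```python
def satterthwaite(coefficients, mean_squares, dfs):
    """Synthesized mean square  sum(l_i * MS_i)  and its approximate df.

    df = (sum l_i MS_i)^2 / sum((l_i MS_i)^2 / nu_i)
    """
    terms = [l * ms for l, ms in zip(coefficients, mean_squares)]
    synthesized = sum(terms)
    df = synthesized ** 2 / sum(t ** 2 / nu for t, nu in zip(terms, dfs))
    return synthesized, df


# Denominator for testing bricks within locations:
# l1 * MS_E + l2 * MS_C(B), with 45 and 4 degrees of freedom.
ms_synth, df_synth = satterthwaite(
    coefficients=[0.0925, 0.9075],
    mean_squares=[42504.223, 30251.865],
    dfs=[45, 4],
)
f_bricks = 13388.183 / ms_synth
print(f"synthesized MS = {ms_synth:.3f}, df = {df_synth:.2f}, F = {f_bricks:.3f}")
```

The approximation is generally regarded as most trustworthy when all coefficients are positive; with negative coefficients the resulting degrees of freedom should be read with caution.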

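The estimates of the variance components $\sigma_e^2$, $\sigma_\gamma^2$, $\sigma_\beta^2$, and $\sigma_\alpha^2$ are obtained
as the solution to the following simultaneous equations:

$$1{,}444{,}857.440 = \sigma_e^2 + 6.3590\sigma_\gamma^2 + 12.2654\sigma_\beta^2 + 26.9630\sigma_\alpha^2,$$
$$13{,}388.183 = \sigma_e^2 + 5.5765\sigma_\gamma^2 + 9.8498\sigma_\beta^2,$$
$$30{,}251.865 = \sigma_e^2 + 6.1446\sigma_\gamma^2,$$

and

$$42{,}504.223 = \sigma_e^2.$$

Therefore, the desired estimates are given by

$$\hat{\sigma}_e^2 = 42{,}504.223,$$

$$\hat{\sigma}_\gamma^2 = \frac{30{,}251.865 - 42{,}504.223}{6.1446} = -1{,}994.004,$$

$$\hat{\sigma}_\beta^2 = \frac{13{,}388.183 - 42{,}504.223 - 5.5765(-1{,}994.004)}{9.8498} = -1{,}827.091,$$

and

$$\hat{\sigma}_\alpha^2 = \frac{1{,}444{,}857.440 - 42{,}504.223 - 6.3590(-1{,}994.004) - 12.2654(-1{,}827.091)}{26.9630} = 53{,}311.690.$$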




The negative estimates are probably an indication that the corresponding variance
components may be zero. The point estimates of variance components
are consistent with the results on tests of hypotheses. It is further evident from
the analysis that most of the variation in the number of diatoms is due to
different locations on the river.
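Because the expected mean squares form a triangular system, the analysis of variance estimates can be obtained by back-substitution, starting from the error mean square. A minimal sketch follows; the function name is hypothetical and the inputs are the mean squares and coefficients of Table 7.7.

```python
def anova_estimates_unbalanced(ms_a, ms_b, ms_c, ms_e, n1, n2, n3, n4, n5, n6):
    """Solve the expected-mean-square equations of Section 7.4 from the bottom up."""
    var_e = ms_e
    var_gamma = (ms_c - var_e) / n1
    var_beta = (ms_b - var_e - n2 * var_gamma) / n3
    var_alpha = (ms_a - var_e - n4 * var_gamma - n5 * var_beta) / n6
    return var_e, var_gamma, var_beta, var_alpha


var_e, var_gamma, var_beta, var_alpha = anova_estimates_unbalanced(
    ms_a=1444857.440, ms_b=13388.183, ms_c=30251.865, ms_e=42504.223,
    n1=6.1446, n2=5.5765, n3=9.8498, n4=6.3590, n5=12.2654, n6=26.9630,
)
print(f"error     {var_e:12.3f}")
print(f"slides    {var_gamma:12.3f}")
print(f"bricks    {var_beta:12.3f}")
print(f"locations {var_alpha:12.3f}")
```

The sketch returns negative values unchanged; whether to report them as such or to replace them by zero is a choice left to the analyst.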


7.9 WORKED EXAMPLE FOR MODEL III


Sokal and Rohlf (1995, p. 289) reported data from an experiment designed 
to analyze glycogen content of rat livers. For each of the three treatments — 
control, compound 217, and compound 217 plus sugar — used in the experi- 
ment, three preparations of rat livers from each of the two rats were analyzed 
and duplicate readings were made for each preparation. The data are given in 
Table 7.8. 

The design structure follows a three-way nested or hierarchical classification 
and the mathematical model is 


$$y_{ijk\ell} = \mu + \alpha_i + \beta_{j(i)} + \gamma_{k(ij)} + e_{\ell(ijk)} \quad \begin{cases} i = 1, 2, 3 \\ j = 1, 2 \\ k = 1, 2, 3 \\ \ell = 1, 2, \end{cases} \qquad (7.9.1)$$

where $y_{ijk\ell}$ is the $\ell$-th observation (reading) on the k-th preparation, on the
j-th rat and for the i-th treatment, $\mu$ is the general mean, $\alpha_i$ is the effect of
the i-th treatment, $\beta_{j(i)}$ is the effect of the j-th rat within the i-th treatment,
$\gamma_{k(ij)}$ is the effect of the k-th preparation within the j-th rat within the
i-th treatment, and $e_{\ell(ijk)}$ is the customary error term. Furthermore, the $\alpha_i$'s are
considered to be fixed effects with $\sum_{i=1}^{3}\alpha_i = 0$, and the $\beta_{j(i)}$'s, $\gamma_{k(ij)}$'s, and
$e_{\ell(ijk)}$'s are assumed to be independently and normally distributed with mean
zero and variances $\sigma_\beta^2$, $\sigma_\gamma^2$, and $\sigma_e^2$, respectively.

Using the computational formulae for the sums of squares given in Section 
7.2, we have 


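$$SS_T = (131)^2 + (130)^2 + \cdots + (127)^2 - \frac{(5{,}120)^2}{3 \times 2 \times 3 \times 2} = 731{,}508 - 728{,}177.778 = 3{,}330.222,$$

$$SS_A = \frac{(1{,}686)^2 + (1{,}812)^2 + (1{,}622)^2}{2 \times 3 \times 2} - \frac{(5{,}120)^2}{3 \times 2 \times 3 \times 2} = 729{,}735.333 - 728{,}177.778 = 1{,}557.555,$$

$$SS_{B(A)} = \frac{(795)^2 + (891)^2 + \cdots + (816)^2}{3 \times 2} - \frac{(1{,}686)^2 + (1{,}812)^2 + (1{,}622)^2}{2 \times 3 \times 2} = 730{,}533 - 729{,}735.333 = 797.667,$$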


TABLE 7.8
Glycogen Content of Rat Livers in Arbitrary Units

Treatment              Control                  Compound 217          Compound 217 plus sugar
Rat                 1           2            1           2             1           2
Preparation       1   2   3   1   2   3    1   2   3   1   2   3     1   2   3   1   2   3

Readings        131 131 136 150 140 160  157 154 147 151 147 162   134 138 135 138 139 134
                130 125 142 148 143 150  145 142 153 155 147 152   125 138 136 140 138 127

$y_{ijk.}$      261 256 278 298 283 310  302 296 300 306 294 314   259 276 271 278 277 261
$y_{ij..}$          795         891          898         914           806         816
$y_{i...}$               1,686                    1,812                     1,622

Source: Sokal and Rohlf (1995, p. 289). Used with permission.




$$SS_{C(B)} = \frac{(261)^2 + (256)^2 + \cdots + (261)^2}{2} - \frac{(795)^2 + (891)^2 + \cdots + (816)^2}{3 \times 2} = 731{,}127 - 730{,}533 = 594.000,$$

and

$$SS_E = (131)^2 + (130)^2 + \cdots + (127)^2 - \frac{(261)^2 + (256)^2 + \cdots + (261)^2}{2} = 731{,}508 - 731{,}127 = 381.000.$$


These results along with the remaining computations are summarized in
Table 7.9. The test of the hypothesis $H_0^{C(B)}: \sigma_\gamma^2 = 0$ versus $H_1^{C(B)}: \sigma_\gamma^2 > 0$ gives
the variance ratio 2.34 which barely reaches its 5 percent critical value of 2.342
(p = 0.050). The test of the hypothesis $H_0^{B(A)}: \sigma_\beta^2 = 0$ versus $H_1^{B(A)}: \sigma_\beta^2 > 0$
gives the variance ratio of 5.37 which clearly exceeds its 5 percent critical value
of 3.49 (p = 0.014). Finally, the test of the hypothesis $H_0^{A}$: all $\alpha_i = 0$ versus
$H_1^{A}$: not all $\alpha_i = 0$ gives the variance ratio of 2.93 which is too low to reach
its 5 percent critical value of 9.55 (p = 0.197). Thus, we may conclude that
although there seem to be significant differences among preparations within
rats and among rats within treatments, there is no indication of any differences
between the treatments. However, note that the F test for treatments has so
few degrees of freedom that it may not be able to detect significant differences
even if there are really important differences among them. Perhaps repetition
of the experiment using more rats per treatment is indicated. The estimates of
variance components $\sigma_e^2$, $\sigma_\gamma^2$, and $\sigma_\beta^2$ are given by

$$\hat{\sigma}_e^2 = 21.167,$$

$$\hat{\sigma}_\gamma^2 = \frac{1}{2}(49.500 - 21.167) = 14.167,$$

and

$$\hat{\sigma}_\beta^2 = \frac{1}{6}(265.889 - 49.500) = 36.065.$$


These variance components account for 29.6, 19.8, and 50.5 percent of the 
total variation in glycogen content in this experiment. It is evident from this 
analysis that the large part of the variability arises among rats within treatments. 
Readings within preparations and preparations within rats also seem to account 
for a significant portion of the total variability in the experiment. However, we 
cannot establish significant differences among treatments. 


TABLE 7.9
Analysis of Variance for the Glycogen Data of Table 7.8

Source of             Degrees of   Sum of       Mean       Expected
Variation             Freedom      Squares      Square     Mean Square                                                                                   F Value   p-Value
Treatments            2            1,557.555    778.778    $\sigma_e^2 + 2\sigma_\gamma^2 + 6\sigma_\beta^2 + \frac{12}{3-1}\sum_{i=1}^{3}\alpha_i^2$    2.93      0.197
Rats (within          3            797.667      265.889    $\sigma_e^2 + 2\sigma_\gamma^2 + 6\sigma_\beta^2$                                             5.37      0.014
treatments)
Preparations          12           594.000      49.500     $\sigma_e^2 + 2\sigma_\gamma^2$                                                               2.34      0.050
(within rats)
Error                 18           381.000      21.167     $\sigma_e^2$
Total                 35           3,330.222




7.10 USE OF STATISTICAL COMPUTING PACKAGES 


The use of SAS, SPSS, and BMDP programs for analyzing three- and higher- 
order nested factors is the same as described in Section 6.15 for the case of 
two-way nested designs. No new problems arise for analysis involving higher- 
order nested designs. 


7.11 WORKED EXAMPLES USING STATISTICAL PACKAGES 


In this section, we illustrate the application of statistical packages to perform 
three-way nested analysis of variance for the data sets employed in examples 
presented in Sections 7.7 through 7.9. Figures 7.2 through 7.4 illustrate the 
program instructions and the output results for analyzing data in Tables 7.4, 
7.6, and 7.8 using SAS GLM, SPSS GLM, and BMDP 3V/8V procedures. The 
typical output provides the data format listed at the top, all cell means, and 
the entries of the analysis of variance table. It should be noticed that in each 
case the results are the same as those provided using manual computations in 
Sections 7.7 through 7.9. However, note that in an unbalanced design, certain 
tests of significance may differ from one program to the other since they use 
different types of sums of squares. 


Program instructions:

    DATA INGREDIENT;
    INPUT VAT BAG SAMPLE ...
    ...
    CLASSES VAT BAG SAMPLE;
    MODEL PERCENT=VAT BAG(VAT) ...
    ...
    TEST H=VAT E=BAG(VAT);
    TEST H=BAG(VAT) E=SAMPLE(BAG VAT);
    RUN;

Output:

    The SAS System
    General Linear Models Procedure
    Dependent Variable: PERCENT

    Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
    Model              23      148.00000000     6.43478261       7.72    0.0001
    Error              24       20.00000000     0.83333333
    Corrected Total    47      168.00000000

    R-Square      C.V.         Root MSE      PERCENT Mean
    0.880952      3.2030559    0.91287093    28.50000000

    Source              DF    Type III SS    Mean Square    F Value    Pr > F
    VAT                  5      66.250000      13.250000      15.90    0.0001
    BAG(VAT)             6      36.250000       6.041667       7.25    0.0002
    SAMPLE(VAT*BAG)     12      45.500000       3.791667       4.55    0.0008

    Source              Type III Expected Mean Square
    VAT                 Var(Error) + 2 Var(SAMPLE(VAT*BAG)) + 4 Var(BAG(VAT)) + 8 Var(VAT)
    BAG(VAT)            Var(Error) + 2 Var(SAMPLE(VAT*BAG)) + 4 Var(BAG(VAT))
    SAMPLE(VAT*BAG)     Var(Error) + 2 Var(SAMPLE(VAT*BAG))

    Tests of Hypotheses using the Type III MS for BAG(VAT) as an error term

    Source       DF    Type III SS    Mean Square    F Value    Pr > F
    VAT           5    66.25000000    13.25000000       2.19    0.1834

    Source       DF    Type III SS    Mean Square    F Value    Pr > F
    BAG(VAT)      6    36.25000000     6.04166667       1.59    0.2316

(i) SAS application: SAS GLM instructions and output for the three-way random effects 
nested analysis of variance. 


FIGURE 7.2 Program Instructions and Output for the Three-Way Random Ef- 
fects Nested Analysis of Variance: Material Homogeneity Data for Example of 
Section 7.7 (Table 7.4). 


Program instructions:

    DATA LIST
     /VAT 1
      BAG 3
      SAMPLE 5
      PERCENT 7-8.
    ...
    GLM PERCENT
     BY VAT BAG SAMPLE
     /DESIGN VAT
      BAG(VAT)
      SAMPLE(BAG(VAT))
     /RANDOM VAT BAG ...

Output:

    Tests of Between-Subjects Effects      Dependent Variable: PERCENT

    Source                           Type III SS    df    Mean Square    F        Sig.
    VAT                Hypothesis         66.250     5       13.250      2.193    .183
                       Error              36.250     6        6.042(a)
    BAG(VAT)           Hypothesis         36.250     6        6.042      1.593    .232
                       Error              45.500    12        3.792(b)
    SAMPLE(BAG(VAT))   Hypothesis         45.500    12        3.792      4.550    .001
                       Error              20.000    24        0.833(c)

    a MS(BAG(VAT))   b MS(SAMPLE(BAG(VAT)))   c MS(ERROR)

    Expected Mean Squares (a,b)
                          Variance Component
    Source                Var(VAT)    Var(BAG(VAT))    Var(SAMPLE(BAG))    Var(Error)
    VAT                      8.000            4.000               2.000         1.000
    BAG(VAT)                  .000            4.000               2.000         1.000
    SAMPLE(BAG(VAT))          .000             .000               2.000         1.000
    Error                     .000             .000                .000         1.000

    a For each source, the expected mean square equals the sum of the
      coefficients in the cells times the variance components, plus a quadratic
      term involving effects in the Quadratic Term cell.
    b Expected Mean Squares are based on the Type III Sums of Squares.


(ii) SPSS application: SPSS GLM instructions and output for the three-way random 
effects nested analysis of variance. 


Program instructions:

    FILE='C:\SAHAI\TEXTO\EJE19.TXT'.
    FORMAT=FREE.
    VARIABLES=2.
    /VARIABLE NAMES=D1,D2.
    /DESIGN NAMES=V,B,S,D.
    LEVELS=6,2,2,2.
    RANDOM=V,B,S,D.
    MODEL='V,B(V),S(B),D(S)'.
    /END
    29 29
    ...
    29 31

    ANALYSIS OF VARIANCE DESIGN
    INDEX         V    B    S    D
    NUM LEVELS    6    2    2    2
    POPULATION    INF  INF  INF  INF
    MODEL         V, B(V), S(B), D(S)

Output:

    BMDP8V - GENERAL MIXED MODEL ANALYSIS OF VARIANCE - EQUAL CELL SIZES
    Release: 7.0 (BMDP/DYNAMIC)

    ANALYSIS OF VARIANCE FOR DEPENDENT VARIABLE 1

    SOURCE    ERROR     SUM OF         D.F.    MEAN         F          PROB.
              TERM      SQUARES                SQUARE
    MEAN      VAT       38988.00000      1     38988.000    2942.49    0.0000
    VAT       B(V)         66.25000      5        13.250       2.19    0.1834
    B(V)      S(VB)        36.25000      6         6.042       1.59    0.2316
    S(VB)     D(VBS)       45.50000     12         3.792       4.55    0.0008
    D(VBS)                 20.00000     24         0.833

    SOURCE    EXPECTED MEAN SQUARE         ESTIMATES OF VARIANCE COMPONENTS
    MEAN      48(1)+8(2)+4(3)+2(4)+(5)     811.97396
    VAT       8(2)+4(3)+2(4)+(5)             0.90104
    B(V)      4(3)+2(4)+(5)                  0.56250
    S(VB)     2(4)+(5)                       1.47917
    D(VBS)    (5)                            0.83333


(iii) BMDP application: BMDP 8V instructions and output for the three-way random 
effects nested analysis of variance. 


FIGURE 7.2 (continued) 
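
As a check on the output in Figure 7.2, the F ratios and the variance component estimates can be recomputed from the printed mean squares. The following is a hedged sketch rather than a reproduction of any package's algorithm: the list names and the loop are assumptions of the example, and only the mean squares and design sizes come from the figure.

```python
# Mean squares as printed in Figure 7.2, ordered from the outermost
# factor down to the residual.
sources = ["VAT", "BAG(VAT)", "SAMPLE(BAG VAT)", "Error"]
mean_squares = [13.250000, 6.041667, 3.791667, 0.833333]

# Number of observations under one level of each source in the balanced
# design (2 bags x 2 samples x 2 determinations, and so on).
obs_per_level = [8, 4, 2, 1]

f_ratios = {}
var_components = {}
for pos, name in enumerate(sources):
    if pos + 1 < len(sources):
        # In a purely nested random model each mean square is tested
        # against the mean square of the next stage down.
        denominator = mean_squares[pos + 1]
        f_ratios[name] = mean_squares[pos] / denominator
        var_components[name] = (mean_squares[pos] - denominator) / obs_per_level[pos]
    else:
        # The residual mean square estimates the error variance directly.
        var_components[name] = mean_squares[pos]

for name in sources:
    f_text = f"{f_ratios[name]:6.2f}" if name in f_ratios else "      "
    print(f"{name:16s} F = {f_text}   component = {var_components[name]:.5f}")
```

The component estimates agree with the ESTIMATES OF VARIANCE COMPONENTS column of the BMDP 8V output, and the F ratios agree with the tests that use the next-lower mean square as the error term.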




Program instructions:

    DATA DIATOMS;
    INPUT LOCATION BRICK SLIDE DIATOMS;
    DATALINES;
    ...
    ;
    PROC GLM;
    CLASSES LOCATION BRICK SLIDE;
    MODEL DIATOMS=LOCATION BRICK(LOCATION) SLIDE(BRICK LOCATION);
    RANDOM LOCATION BRICK(LOCATION) SLIDE(BRICK LOCATION)/TEST;
    RUN;

    CLASS       LEVELS
    LOCATION    2
    BRICK       3
    SLIDE       2
    NUMBER OF OBS. IN DATA SET=54

Output:

    The SAS System
    General Linear Models Procedure
    Dependent Variable: DIATOMS

    Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
    Model               8         1606029.4       200753.7       4.72    0.0003
    Error              45         1912690.0        42504.2
    Corrected Total    53         3518719.5

    R-Square    C.V.        Root MSE    DIATOMS Mean
    0.456424    44.00719    206.17      468.48

    Source                   DF    Type I SS    Mean Square    F Value    Pr > F
    LOCATION                  1    1444857.4      1444857.4      33.99    0.0001
    BRICK(LOCATION)           3      40164.6        13388.2       0.31    0.8144
    SLIDE(LOCATION*BRICK)     4     121007.5        30251.9       0.71    0.5882

    Source                   DF    Type III SS    Mean Square    F Value    Pr > F
    LOCATION                  1      1483396.9      1483396.9      34.90    0.0001
    BRICK(LOCATION)           3        34822.4        11607.5       0.27    0.8445
    SLIDE(LOCATION*BRICK)     4       121007.5        30251.9       0.71    0.5882

    Source                   Type III Expected Mean Square
    LOCATION                 Var(Error) + 5.591 Var(SLIDE(LOCATION*BRICK))
                             + 10.271 Var(BRICK(LOCATION)) + 24.489 Var(LOCATION)
    BRICK(LOCATION)          Var(Error) + 5.4151 Var(SLIDE(LOCATION*BRICK))
                             + 9.6922 Var(BRICK(LOCATION))
    SLIDE(LOCATION*BRICK)    Var(Error) + 6.1446 Var(SLIDE(LOCATION*BRICK))

    Source: LOCATION
    Error: 1.0598*MS(BRICK(LOCATION)) - 0.024*MS(SLIDE(LOCATION*BRICK)) - 0.0357*MS(Error)
                             Denominator    Denominator
    DF    Type III MS        DF             MS              F Value     Pr > F
     1    1483396.9269       2.00           10055.801769    147.5165    0.0067

    Source: BRICK(LOCATION)
    Error: 0.8813*MS(SLIDE(LOCATION*BRICK)) + 0.1187*MS(Error)
                             Denominator    Denominator
    DF    Type III MS        DF             MS              F Value
     3    11607.477802       5.64           31706.429974    0.3661

    Source: SLIDE(LOCATION*BRICK)
    Error: MS(Error)
                             Denominator    Denominator
    DF    Type III MS        DF             MS              F Value
     4    30251.865341       45             42504.222937    0.7117

(i) SAS application: SAS GLM instructions and output for the three-way random effects 
nested analysis of variance with unequal numbers in the subclasses. 


Program instructions:

    DATA LIST
     /LOCATION 1
      BRICK 3
      SLIDE 5
      DIATOMS 7-9.
    BEGIN DATA.
    ...
    END DATA.
    GLM DIATOMS BY LOCATION BRICK SLIDE
     /RANDOM LOCATION BRICK SLIDE
     /DESIGN LOCATION
      BRICK(LOCATION)
      SLIDE(BRICK(LOCATION)).

Output:

    Tests of Between-Subjects Effects      Dependent Variable: DIATOMS

    Source                                 Type III SS    df       Mean Square       F          Sig.
    LOCATION                 Hypothesis    1483396.927    1        1483396.927       147.517    .007
                             Error           20086.860    1.998      10055.802(a)
    BRICK(LOC)               Hypothesis      34822.434    3          11607.478          .366    .781
                             Error          178807.755    5.639      31706.430(b)
    SLIDE(BRICK(LOCATION))   Hypothesis     121007.461    4          30251.865          .712    .588
                             Error         1912690.000    45         42504.222(c)

    a 1.060 MS(B(L)) - 2.403E-02 MS(S(B(L))) - 3.572E-02 MS(E)
    b .881 MS(S(B(L))) + .119 MS(E)
    c MS(E)

    Expected Mean Squares (a,b)
                                Variance Component
    Source                      Var(LOC)    Var(B(LOC))    Var(S(B))    Var(Error)
    LOCATION                      24.489         10.271        5.591         1.000
    BRICK(LOCATION)                 .000          9.692        5.415         1.000
    SLIDE(BRICK(LOCATION))          .000           .000        6.145         1.000
    Error                           .000           .000         .000         1.000

    a For each source, the expected mean square equals the sum of the
      coefficients in the cells times the variance components, plus a quadratic
      term involving effects in the Quadratic Term cell.
    b Expected Mean Squares are based on the Type III Sums of Squares.


(ii) SPSS application: SPSS GLM instructions and output for the three-way random 
effects nested analysis of variance with unequal numbers in the subclasses. 


FIGURE 7.3. Program Instructions and Output for the Three-Way Random Ef- 
fects Nested Analysis of Variance with Unequal Numbers in the Subclasses: Diatom 
Data for Example of Section 7.8 (Table 7.6). 


Program instructions:

    /INPUT FILE='C:\SAHAI\TEXTO\EJE20.TXT'.
    FORMAT=FREE.
    VARIABLES=4.
    /VARIABLE NAMES=LOCATION,BRICK,SLIDE,DIATOM.
    CODES(LOCATION)=1,2.
    NAMES(LOCATION)=L1,L2.
    CODES(BRICK)=1,2,3.
    NAMES(BRICK)=B1,B2,B3.
    CODES(SLIDE)=1,2.
    NAMES(SLIDE)=S1,S2.
    DEPENDENT=DIATOM.
    RANDOM=LOCATION.
    RANDOM=LOCATION,BRICK.
    RANDOM=BRICK,SLIDE.
    RNAMES=L,'B(L)','S(B)'.
    METHOD=REML.

Output:

    BMDP3V - GENERAL MIXED MODEL ANALYSIS OF VARIANCE
    Release: 7.0 (BMDP/DYNAMIC)

    DEPENDENT VARIABLE DIATOM

    PARAMETER    ESTIMATE     STANDARD     EST/        TWO-TAIL PROB.
                              ERROR        ST. DEV.    (ASYM. THEORY)
    ERR. VAR.    39881.962     7821.496
    CONSTANT       474.377      163.687    2.898       0.004
    LOCATION     52107.607    75783.658
    BRK(LOC)         0.000        0.000
    SLD(BRK)         0.000        0.000

    TESTS OF FIXED EFFECTS BASED ON ASYMPTOTIC VARIANCE-COVARIANCE MATRIX

    SOURCE      F-STATISTIC    DEGREES OF FREEDOM    PROBABILITY
    CONSTANT    ...

(iii) BMDP application: BMDP 3V instructions and output for the three-way random 
effects nested analysis of variance with unequal numbers in the subclasses. 


FIGURE 7.3 (continued) 


Program instructions:

    DATA GLYCOGEN;
    INPUT TREATMNT $ RAT PREPARAT GLYCOGEN;
    DATALINES;
    ...
    ;
    PROC GLM;
    CLASSES TREATMNT RAT PREPARAT;
    MODEL GLYCOGEN=TREATMNT RAT(TREATMNT) PREPARAT(RAT TREATMNT);
    RANDOM RAT(TREATMNT) PREPARAT(RAT TREATMNT)/TEST;
    RUN;

    CLASS       LEVELS
    TREATMNT    3
    RAT         2
    PREPARAT    3
    NUMBER OF OBS. IN DATA SET=36

Output:

    The SAS System
    General Linear Models Procedure
    Dependent Variable: GLYCOGEN

    Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
    Model              17        2949.22222      173.48366       8.20    0.0001
    Error              18         381.00000       21.16667
    Corrected Total    35        3330.22222

    R-Square    C.V.        Root MSE    GLYCOGEN Mean
    0.885593    3.234884    4.60072     142.222

    Source                    DF    Type III SS    Mean Square    F Value    Pr > F
    TREATMNT                   2     1557.55556      778.77778      36.79    0.0001
    RAT(TREATMNT)              3      797.66667      265.88889      12.56    0.0001
    PREPARAT(TREATMNT*RAT)    12      594.00000       49.50000       2.34    0.0503

    Source                    Type III Expected Mean Square
    TREATMNT                  Var(Error) + 2 Var(PREPARAT(TREATMNT*RAT))
                              + 6 Var(RAT(TREATMNT)) + Q(TREATMNT)
    RAT(TREATMNT)             Var(Error) + 2 Var(PREPARAT(TREATMNT*RAT))
                              + 6 Var(RAT(TREATMNT))
    PREPARAT(TREATMNT*RAT)    Var(Error) + 2 Var(PREPARAT(TREATMNT*RAT))

    Tests of Hypotheses for Mixed Model Analysis of Variance

    Source: TREATMNT                  Error: MS(RAT(TREATMNT))
                           Denominator    Denominator
    DF    Type III MS      DF             MS              F Value    Pr > F
     2    778.77777778      3             265.88888889    2.9290     0.1971

    Source: RAT(TREATMNT)             Error: MS(PREPARAT(TREATMNT*RAT))
                           Denominator    Denominator
    DF    Type III MS      DF             MS              F Value    Pr > F
     3    265.88888889     12             49.5            5.3715     0.0141

    Source: PREPARAT(TREATMNT*RAT)    Error: MS(Error)
                           Denominator    Denominator
    DF    Type III MS      DF             MS              F Value    Pr > F
    12    49.5             18             21.16666666     2.3386     0.0503

(i) SAS application: SAS GLM instructions and output for the three-way mixed
effects nested analysis of variance.


FIGURE 7.4 Program Instructions and Output for the Three-Way Mixed Effects 
Nested Analysis of Variance: Glycogen Data for Example of Section 7.9 (Table 7.8). 


Program instructions:

    DATA LIST
     /TREATMNT 1
      RAT 3
      PREPARAT 5
      GLYCOGEN 7-9.
    BEGIN DATA.
    1 1 1 131
    1 1 1 130
    1 1 2 131
    1 1 2 125
    ...
    3 2 3 127
    END DATA.
    ...
     /DESIGN TREATMNT
      RAT(TREATMNT)
      PREPARAT(RAT(TREATMNT))
     /RANDOM RAT PREPARAT.

Output:

    Tests of Between-Subjects Effects      Dependent Variable: GLYCOGEN

    Source                                  Type III SS    df    Mean Square    F        Sig.
    TREATMNT                  Hypothesis       1557.556     2      778.778      2.929    .197
                              Error             797.667     3      265.889(a)
    RAT(TREATMNT)             Hypothesis        797.667     3      265.889      5.371    .014
                              Error             594.000    12       49.500(b)
    PREPARAT(RAT(TREATMNT))   Hypothesis        594.000    12       49.500      2.339    .050
                              Error             381.000    18       21.167(c)

    a MS(RAT(TREATMNT))   b MS(PREPARAT(RAT(TREATMNT)))   c MS(E)

    Expected Mean Squares (a,b)
                                 Variance Component
    Source                       Var(R(T))    Var(P(R(T)))    Var(Error)    Quadratic Term
    TREATMNT                         6.000           2.000         1.000    TREATMNT
    RAT(TREATMNT)                    6.000           2.000         1.000
    PREPARAT(RAT(TREATMNT))           .000           2.000         1.000
    Error                             .000            .000         1.000

    a For each source, the expected mean square equals the sum of the
      coefficients in the cells times the variance components, plus a quadratic
      term involving effects in the Quadratic Term cell.
    b Expected Mean Squares are based on the Type III Sums of Squares.


(ii) SPSS application: SPSS GLM instructions and output for the three-way mixed 
effects nested analysis of variance. 


Program instructions:

    FILE='C:\SAHAI\TEXTO\EJE21.TXT'.
    FORMAT=FREE.
    VARIABLES=2.
    NAMES=R1,R2.
    NAMES=T,R,P,D.
    LEVELS=3,2,3,2.
    RANDOM=R,P,D.
    FIXED=T.
    MODEL='T,R(T),P(R),D(P)'.
    /END
    131 130
    ...
    134 127

    ANALYSIS OF VARIANCE DESIGN
    INDEX              T    R    P    D
    NUM LEVELS         3    2    3    2
    POPULATION SIZE    3    INF  INF  INF
    MODEL              T, R(T), P(R), D(P)

Output:

    BMDP8V - GENERAL MIXED MODEL ANALYSIS OF VARIANCE - EQUAL CELL SIZES
    Release: 7.0 (BMDP/DYNAMIC)

    ANALYSIS OF VARIANCE FOR DEPENDENT VARIABLE 1

    SOURCE      ERROR     SUM OF        D.F.    MEAN         F          PROB.
                TERM      SQUARES               SQUARE
    MEAN        R(T)      728177.778      1     728177.78    2738.65    0.0000
    TREATMNT    R(T)        1557.556      2        778.78       2.93    0.1971
    R(T)        P(TR)        797.667      3        265.89       5.37    0.0141
    P(TR)       D(TRP)       594.000     12         49.50       2.34    0.0503
    D(TRP)                   381.000     18         21.17

    SOURCE      EXPECTED MEAN SQUARE     ESTIMATES OF VARIANCE COMPONENTS
    MEAN        36(1)+6(3)+2(4)+(5)      20219.77469
    TREATMNT    12(2)+6(3)+2(4)+(5)         42.74074
    R(T)        6(3)+2(4)+(5)               36.06481
    P(TR)       2(4)+(5)                    14.16667
    D(TRP)      (5)                         21.16667


(iii) BMDP application: BMDP 8V instructions and output for the three-way mixed 
effects nested analysis of variance. 


FIGURE 7.4 (continued) 


EXERCISES 


1. An experiment is performed to investigate alloy hardness using a three-
way nested design having two fixed alloys with different chemistries,
three heats within each alloy, and two ingots within each heat; two
determinations are made on each ingot. The data in certain standard
units are given as follows.

(a) Describe the model and the assumptions for the experiment. It is
assumed that alloys and heats are fixed and ingots are random.

(b) Analyze the data and report the analysis of variance table.

(c) Test whether there are differences in mean hardness levels be-
tween alloy chemistries. Use $\alpha = 0.05$.

(d) Test whether there are differences in mean hardness levels be-
tween heats within alloys. Use $\alpha = 0.05$.

(e) Test whether there are differences in mean hardness levels be-
tween ingots within heats. Use $\alpha = 0.05$.

(f) Estimate the variance components of the model and determine
95 percent confidence intervals on them.


2. A chemical company wishes to examine the strength of a certain liquid
chemical. The chemical is made in large vats and is then barreled. A
random sample of three different vats is selected, three barrels are
selected at random from each vat, and then two samples are taken from
each barrel. Finally, two independent measurements are made on each
sample. The data in certain standard units are given as follows.

Vat      |            1            |            2            |            3
Barrel   |   1       2       3     |   1       2       3     |   1       2       3
Sample   | 1   2   1   2   1   2   | 1   2   1   2   1   2   | 1   2   1   2   1   2
         | 4.3 4.0 4.3 4.6 4.7 4.9 | 4.8 4.6 4.7 4.5 4.3 4.5 | 5.0 5.3 5.1 5.0 5.0 5.1
         | 4.1 4.5 4.5 4.4 4.4 4.3 | 4.7 4.5 4.5 4.7 4.7 5.1 | 4.8 5.2 4.8 5.2 4.7 4.9

(a) Describe the model and the assumptions for the experiment.

(b) Analyze the data and report the analysis of variance table.

(c) Test whether there are differences in mean strength levels between
vats. Use $\alpha = 0.05$.

(d) Test whether there are differences in mean strength levels between
barrels within vats. Use $\alpha = 0.05$.

(e) Test whether there are differences in mean strength levels between
samples within barrels. Use $\alpha = 0.05$.

(f) Estimate the variance components of the model and determine 95
percent confidence intervals on them.

3. Consider an experiment designed to study heat transfer in the molds 
utilized in manufacturing household plastics. A company has two 
plants that manufacture household plastics. Two furnaces are randomly 
selected from each plant and two molds are drawn from each furnace. 
The response variable of interest is the mold temperature, and five 
temperatures are recorded from each mold. The data from test results 
of the experiment are given as follows. 


Plant 1 2 
Furnace 1 2 1 2 
Mold 1 2 1 2 1 2 1 2 


Temperature (°C) 468 473 474 475 481 481 480 480 


(a) Describe the model and the assumptions for the experiment. It 
is assumed that the effect due to plant is a fixed effect whereas 
furnace and mold are random factors. 

(b) Analyze the data and report the analysis of variance table. 

(c) Test whether there are differences in mean temperature levels
between the plants. Use $\alpha = 0.05$.

(d) Test whether there are differences in mean temperature levels
between furnaces within plants. Use $\alpha = 0.05$.

(e) Test whether there are differences in mean temperature levels
between molds within furnaces. Use $\alpha = 0.05$.

(f) Estimate the variance components of the model and determine 95 
percent confidence intervals on them. 

4. Bliss (1967, p. 354) reported data from an experiment designed to 
investigate variation in insecticide residue on celery. The experiment 
was carried out on 11 randomly selected plots of celery which were 
sprayed with insecticide and residue was measured from plants se- 
lected in three stages. Three samples of plants were selected from 
each plot and one or two subsamples were selected from each sample. 
Finally one or two independent measurements on residue were made 
on each subsample. The following data refer to a subset of 6 plots that 
have been randomly selected from 11 plots in the experiment. 




Plot 1 2 3 
Sample 1 2 3 1 2 3 1 2 3 


Subsample 1 2 1 2 141 134 2 1% 2 1% 1% 2 1 #2 «1 


Residue 0.52 0.40 0.26 0.54 0.52 0.18 0.31 0.13 0.25 0.10 0.52 0.55 0.33 0.26 0.41
0.43 0.52 0.24 0.29 0.66 0.40 


Plot 4 5 6 
Sample 1 2 3 1 2 3 1 2 3 


Subsample 1 2 1 2 141 134 2 131% 2 7% 3% 2 134 2 «21 


Residue 0.77 0.51 0.44 0.50 0.44 0.50 0.60 0.60 0.71 0.92 0.24 0.48 0.53 0.50 0.39 
0.56 0.60 0.67 0.53 0.36 0.30 


Source: Bliss (1967, p. 354). Used with permission. 


(a) Describe the model and the assumptions for the experiment.

(b) Analyze the data and report the analysis of variance table.

(c) Test whether there are differences in mean residue levels between
plots. Use $\alpha = 0.05$.

(d) Test whether there are differences in mean residue levels between
samples within plots. Use $\alpha = 0.05$.

(e) Test whether there are differences in mean residue levels between
subsamples within samples. Use $\alpha = 0.05$.

(f) Estimate the variance components of the model and determine 
95 percent confidence intervals on them. 

5. Anderson and Bancroft (1952, p. 333) reported the results of an exper- 
iment designed to study some of the factors affecting the variability 
of estimates of various soil properties. The experiment was conducted 
on 20 fields by sampling two sections from each field. Two samples 
consisting of a composite of 20 borings were taken from each section 
and finally two subsamples were drawn from each sample. The data 
were analyzed for several soil properties and the following table gives 
an analysis of variance for the magnesium data. 


Analysis of Variance for the Magnesium Data

Source of                      Degrees of    Mean      Expected Mean
Variation                      Freedom       Square    Square
Field                                        0.1809
Section (within fields)                      0.0545
Sample (within sections)                     0.0080
Subsample (within samples)                   0.0005

Source: Anderson and Bancroft (1952, p. 333). Used with permission.


(a) State the model and the assumptions for the experiment.

(b) Complete the missing columns of the preceding analysis of vari-
ance table.

(c) Test whether there are differences in mean levels of magnesium
between samples within sections. Use $\alpha = 0.05$.

(d) Test whether there are differences in mean levels of magnesium
between sections within fields. Use $\alpha = 0.05$.

(e) Test whether there are differences in mean levels of magnesium
between fields. Use $\alpha = 0.05$.

(f) Estimate the variance components of the model and determine 95
percent confidence intervals on them.

6. Anderson* and Bancroft (1952, pp. 334-335) described an experiment
designed to test various molds for their efficacy in the manufacturing
of streptomycin. A trial experiment to assess variability at various
stages of the production process is to be run. There are five stages in
the production process: the initial incubation stage in a test tube,
a primary inoculation period in a petri dish, a secondary inoculation
period, a fermentation period in a bath, and the final assay of the
quantity of streptomycin produced. The number of test tubes to be
used at different stages are as follows: a = 5, b = 2, c = 2, d = 2,
and n = 2, giving a total of 80 assays for the final analysis. Let
$\sigma_\alpha^2$, $\sigma_{\beta(\alpha)}^2$, $\sigma_{\gamma(\beta)}^2$,
$\sigma_{\delta(\gamma)}^2$, and $\sigma_e^2$ be the variance components
associated with the five stages of the production process; and consider
the following analysis of variance table.

Analysis of Variance for the Streptomycin Production Data

Source of                                         Degrees of    Mean          Expected
Variation                                         Freedom       Square        Mean Square
Incubation stage                                                $MS_A$
Primary inoculation (within incubation stage)                   $MS_{B(A)}$
Secondary inoculation (within primary inoculation)              $MS_{C(B)}$
Fermentation (within secondary inoculation)                     $MS_{D(C)}$
Final assay (within fermentation)                               $MS_{E(D)}$
Error                                                           $MS_E$


(a) State the model and the assumptions for the experiment consid- 
ering all effects are random. 

(b) Complete the missing columns of the preceding analysis of vari- 
ance table. 

(c) Determine algebraic expressions for the estimates of the variance 
components as functions of the mean squares. 


* Dr. R.L. Anderson first proposed this design for an experiment conducted at the Purdue University 
in 1950. Used with permission. 


8 Partially Nested 
Classifications 


8.0 PREVIEW 


In the preceding chapters, we discussed classification models involving several 
factors that are either all crossed or all nested. Occasionally, in a multifac- 
tor experiment, some factors will be crossed and others nested. Such designs 
are called partially nested (hierarchical), crossed-nested, nested-factorial, or 
mixed-classification designs. For example, suppose that in a study involving an 
industrial experiment it is desired to test three different methods of a produc- 
tion process. For each method, five operators are employed. The experiment 
is carried out over a period of four days and three observations are obtained 
for each combination of method, operator, and day. Because of the nature of 
the experiment, the five operators employed under Method I are really individ- 
uals different from the five operators under Method II or Method III, and the 
five operators under Method II are different from those under Method III. The 
physical layout of such an experiment can be depicted schematically as shown 
in Figure 8.1. In this experiment, the days are crossed with the methods and 
operators, and operators are nested within methods. 
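
The layout of this example can be enumerated in a few lines to make the distinction between nested and crossed factors concrete. This is a minimal sketch; the labels and variable names are assumptions chosen for the illustration, and only the numbers of methods, operators, days, and observations come from the text.

```python
from itertools import product

methods = ["I", "II", "III"]
operators_per_method = 5
days = [1, 2, 3, 4]
observations_per_cell = 3

# Operators are nested within methods: an operator is identified by the pair
# (method, operator number), so operator 1 under Method I is a different
# individual from operator 1 under Method II.
operators = [(m, o) for m in methods for o in range(1, operators_per_method + 1)]

# Days are crossed with methods and operators: every operator works on every day.
cells = [(m, o, d) for (m, o), d in product(operators, days)]

n_observations = len(cells) * observations_per_cell

print(len(operators), "distinct operators")
print(len(cells), "method-operator-day cells")
print(n_observations, "observations in all")
```

With these sizes the enumeration gives 15 distinct operators, 60 cells, and 180 observations.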


8.1 MATHEMATICAL MODEL 


Consider three factors A, B, and C having a, b, and c levels, respectively. Let 
b levels of factor B be nested under each level of A and let the c levels of factor 
C be crossed with a levels of factor A and b levels of factor B. The model for 
this type of experimental layout can be written as 


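$$
y_{ijk\ell} = \mu + \alpha_i + \beta_{j(i)} + \gamma_k + (\alpha\gamma)_{ik}
+ (\beta\gamma)_{jk(i)} + e_{\ell(ijk)}
\quad
\begin{cases}
i = 1, \ldots, a \\
j = 1, \ldots, b \\
k = 1, \ldots, c \\
\ell = 1, \ldots, n,
\end{cases}
\tag{8.1.1}
$$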

where $\mu$ is the general mean, $\alpha_i$ is the effect due to the i-th level
of factor A, $\beta_{j(i)}$ is the effect due to the j-th level of factor B
within the i-th level of factor A, $\gamma_k$ is the effect due to the k-th
level of factor C, $(\alpha\gamma)_{ik}$ is the interaction of the i-th level of
factor A with the k-th level of factor C, $(\beta\gamma)_{jk(i)}$ is the
interaction of the j-th level of factor B with the k-th level of factor C
within the i-th level of factor A, and $e_{\ell(ijk)}$ is the usual error term.
Notice that no A × B interaction can exist, because the levels of factor B
occur within different levels of factor A. Similarly, there can be no
three-way interaction A × B × C.

[Figure: rows list Operators 1 to 5 under each of Methods I, II, and III;
columns list Days 1 to 4.]

FIGURE 8.1 A Layout for the Partially Nested Design Where Days Are Crossed
with Methods and Operators Are Nested within Methods.

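Under Model I, the $\alpha_i$'s, $\beta_{j(i)}$'s, $\gamma_k$'s,
$(\alpha\gamma)_{ik}$'s, and $(\beta\gamma)_{jk(i)}$'s are constants subject
to the restrictions:

$$
\sum_{i=1}^{a} \alpha_i = \sum_{k=1}^{c} \gamma_k = 0,
$$

$$
\sum_{i=1}^{a} (\alpha\gamma)_{ik} = \sum_{k=1}^{c} (\alpha\gamma)_{ik} = 0,
$$

$$
\sum_{j=1}^{b} \beta_{j(i)} = 0 \quad \text{for each } i,
$$

$$
\sum_{j=1}^{b} (\beta\gamma)_{jk(i)} = 0 \quad \text{for each } (i, k),
$$

$$
\sum_{k=1}^{c} (\beta\gamma)_{jk(i)} = 0 \quad \text{for each } j(i),
$$

and the $e_{\ell(ijk)}$'s are uncorrelated and randomly distributed with mean
zero and variance $\sigma_e^2$. However, for a fixed k, the
$(\beta\gamma)_{jk(i)}$'s do not sum to zero over i for a fixed j.

Under Model II, we assume that the $\alpha_i$'s, $\beta_{j(i)}$'s,
$\gamma_k$'s, $(\alpha\gamma)_{ik}$'s, $(\beta\gamma)_{jk(i)}$'s, and
$e_{\ell(ijk)}$'s are uncorrelated and randomly distributed with zero means
and variances $\sigma_\alpha^2$, $\sigma_{\beta(\alpha)}^2$, $\sigma_\gamma^2$,
$\sigma_{\alpha\gamma}^2$, $\sigma_{\beta\gamma(\alpha)}^2$, and $\sigma_e^2$,
respectively. Thus, $\sigma_\alpha^2$, $\sigma_{\beta(\alpha)}^2$,
$\sigma_\gamma^2$, $\sigma_{\alpha\gamma}^2$, $\sigma_{\beta\gamma(\alpha)}^2$,
and $\sigma_e^2$ are the variance components of the model (8.1.1).

Various types of mixed models are possible and their assumptions can
analogously be stated. For example, with A and C fixed and B random, we assume
that the $\alpha_i$'s, $\gamma_k$'s, and $(\alpha\gamma)_{ik}$'s are constants
subject to the restrictions:

$$
\sum_{i=1}^{a} \alpha_i = \sum_{k=1}^{c} \gamma_k = 0,
$$

$$
\sum_{i=1}^{a} (\alpha\gamma)_{ik} = \sum_{k=1}^{c} (\alpha\gamma)_{ik} = 0.
$$

Furthermore, the $\beta_{j(i)}$'s, $(\beta\gamma)_{jk(i)}$'s, and
$e_{\ell(ijk)}$'s are randomly distributed with zero means and variances
$\sigma_{\beta(\alpha)}^2$, $\sigma_{\beta\gamma(\alpha)}^2$, and $\sigma_e^2$,
respectively; and the three groups of random variables are pairwise
uncorrelated. The random effects $(\beta\gamma)_{jk(i)}$'s, however, are
correlated due to the restrictions:

$$
\sum_{k=1}^{c} (\beta\gamma)_{jk(i)} = 0 \quad \text{for each } j(i).
$$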


8.2 ANALYSIS OF VARIANCE 


The identity corresponding to the model (8.1.1) is 


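$$
\begin{aligned}
y_{ijk\ell} - \bar{y}_{....} ={}& (\bar{y}_{i...} - \bar{y}_{....})
+ (\bar{y}_{ij..} - \bar{y}_{i...}) + (\bar{y}_{..k.} - \bar{y}_{....}) \\
&+ (\bar{y}_{i.k.} - \bar{y}_{i...} - \bar{y}_{..k.} + \bar{y}_{....})
+ (\bar{y}_{ijk.} - \bar{y}_{ij..} - \bar{y}_{i.k.} + \bar{y}_{i...}) \\
&+ (y_{ijk\ell} - \bar{y}_{ijk.}).
\end{aligned}
\tag{8.2.1}
$$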


Note that the terms on the right-hand side of (8.2.1) are the sample estimates of
the terms on the right-hand side of the model (8.1.1) excluding the grand mean.
The first and third terms are similar to main effects in a crossed classification
model (5.1.1). The second term is analogous to an ordinary nested term such as
the second term in (6.3.1). The fourth term is an ordinary two-way interaction
similar to the fifth term of (5.3.1). The fifth term can be obtained by considering
it as the difference between $\bar{y}_{ijk.}$ and the term obtained as the
general mean $\bar{y}_{....}$ + the factor A effect (i.e.,
$\bar{y}_{i...} - \bar{y}_{....}$) + the factor B within A effect (i.e.,
$\bar{y}_{ij..} - \bar{y}_{i...}$) + the factor C effect (i.e.,
$\bar{y}_{..k.} - \bar{y}_{....}$) + the A × C interaction (i.e.,
$\bar{y}_{i.k.} - \bar{y}_{i...} - \bar{y}_{..k.} + \bar{y}_{....}$); that is,

$$
\begin{aligned}
\bar{y}_{ijk.} &- [\bar{y}_{....} + (\bar{y}_{i...} - \bar{y}_{....})
+ (\bar{y}_{ij..} - \bar{y}_{i...}) + (\bar{y}_{..k.} - \bar{y}_{....})
+ (\bar{y}_{i.k.} - \bar{y}_{i...} - \bar{y}_{..k.} + \bar{y}_{....})] \\
&= \bar{y}_{ijk.} - \bar{y}_{ij..} - \bar{y}_{i.k.} + \bar{y}_{i...}.
\end{aligned}
\tag{8.2.2}
$$


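Alternatively, partially hierarchical models can be looked upon as degenerate
cases of completely crossed models. For example, suppose that the B effect is
fully crossed with A, so that there will be a B main effect
$\bar{y}_{.j..} - \bar{y}_{....}$ and an A × B interaction
$\bar{y}_{ij..} - \bar{y}_{i...} - \bar{y}_{.j..} + \bar{y}_{....}$. Now, noting
that the B effect is not really a main effect and combining it with its
interaction with A, we obtain

$$
(\bar{y}_{.j..} - \bar{y}_{....})
+ (\bar{y}_{ij..} - \bar{y}_{i...} - \bar{y}_{.j..} + \bar{y}_{....})
= \bar{y}_{ij..} - \bar{y}_{i...},
\tag{8.2.3}
$$

which is precisely the second term on the right-hand side of (8.2.1). Similarly,
if B were a crossed effect, then it would have an interaction with C, and its
interaction with A would also have an interaction with C. But since B is not
really a crossed effect, these two interactions are combined to obtain

$$
\begin{aligned}
&(\bar{y}_{.jk.} - \bar{y}_{.j..} - \bar{y}_{..k.} + \bar{y}_{....}) \\
&\quad + (\bar{y}_{ijk.} - \bar{y}_{ij..} - \bar{y}_{i.k.} - \bar{y}_{.jk.}
+ \bar{y}_{i...} + \bar{y}_{.j..} + \bar{y}_{..k.} - \bar{y}_{....}) \\
&\quad = \bar{y}_{ijk.} - \bar{y}_{ij..} - \bar{y}_{i.k.} + \bar{y}_{i...},
\end{aligned}
\tag{8.2.4}
$$

which is equivalent to (8.2.2).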
The same reasoning also holds in the determination of the degrees of freedom.
For the B within A effect, each level of A contributes b - 1 degrees of freedom
and since there are a levels of A, the total number of degrees of freedom is
a(b - 1). However, using the argument of (8.2.3), the degrees of freedom would
be

$$
(b - 1) + (a - 1)(b - 1) = a(b - 1),
\tag{8.2.5}
$$

which gives exactly the same value. For the B × C within A interaction, since C
has c - 1 degrees of freedom and B within A has a(b - 1) degrees of freedom,
their interaction will have a(b - 1)(c - 1) degrees of freedom. From the
argument of (8.2.4), the degrees of freedom will be

$$
(b - 1)(c - 1) + (a - 1)(b - 1)(c - 1) = a(b - 1)(c - 1),
\tag{8.2.6}
$$

which again gives the same result.
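
The degrees-of-freedom bookkeeping can be verified numerically. The sketch below is illustrative only: the function name `partially_nested_df` and the chosen values of a, b, c, and n are assumptions of the example, not part of the text.

```python
def partially_nested_df(a, b, c, n):
    """Degrees of freedom for model (8.1.1): B nested in A, C crossed with A and B."""
    return {
        "A": a - 1,
        "B(A)": a * (b - 1),
        "C": c - 1,
        "AC": (a - 1) * (c - 1),
        "BC(A)": a * (b - 1) * (c - 1),
        "Error": a * b * c * (n - 1),
    }

# Illustrative sizes: 3 levels of A, 5 of B within each A, 4 of C, 3 replicates.
a, b, c, n = 3, 5, 4, 3
df = partially_nested_df(a, b, c, n)

# Identity (8.2.5): B main effect plus A x B interaction pooled into B(A).
identity_825 = (b - 1) + (a - 1) * (b - 1) == df["B(A)"]

# Identity (8.2.6): B x C plus A x B x C pooled into BC(A).
identity_826 = (b - 1) * (c - 1) + (a - 1) * (b - 1) * (c - 1) == df["BC(A)"]

# The component degrees of freedom must add up to abcn - 1.
total_df = sum(df.values())

print(df)
print(identity_825, identity_826, total_df == a * b * c * n - 1)
```

Both pooling identities hold for any positive integers, since each is an algebraic factorization; the script simply confirms them for one set of sizes.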




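Now, performing the operations of squaring and summing over all indices of
(8.2.1), we obtain the following partition of the total sum of squares:

$$
SS_T = SS_A + SS_{B(A)} + SS_C + SS_{AC} + SS_{BC(A)} + SS_E,
$$

where

$$
\begin{aligned}
SS_T &= \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\sum_{\ell=1}^{n}
(y_{ijk\ell} - \bar{y}_{....})^2, \\
SS_A &= bcn \sum_{i=1}^{a} (\bar{y}_{i...} - \bar{y}_{....})^2, \\
SS_{B(A)} &= cn \sum_{i=1}^{a}\sum_{j=1}^{b} (\bar{y}_{ij..} - \bar{y}_{i...})^2, \\
SS_C &= abn \sum_{k=1}^{c} (\bar{y}_{..k.} - \bar{y}_{....})^2, \\
SS_{AC} &= bn \sum_{i=1}^{a}\sum_{k=1}^{c}
(\bar{y}_{i.k.} - \bar{y}_{i...} - \bar{y}_{..k.} + \bar{y}_{....})^2, \\
SS_{BC(A)} &= n \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}
(\bar{y}_{ijk.} - \bar{y}_{ij..} - \bar{y}_{i.k.} + \bar{y}_{i...})^2,
\end{aligned}
$$

and

$$
SS_E = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\sum_{\ell=1}^{n}
(y_{ijk\ell} - \bar{y}_{ijk.})^2.
$$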

The corresponding mean squares are denoted by $MS_A$, $MS_{B(A)}$, $MS_C$,
$MS_{AC}$, $MS_{BC(A)}$, and $MS_E$, respectively. The expected values of mean squares can
be derived as before. Bennett and Franklin (1954, pp. 410-427) give a gen- 
eral procedure for obtaining the expected values in partially nested classifi- 
cations. The resultant analysis of variance is shown in Table 8.1. The proper 
test statistic for any main effect or interaction of interest can be obtained from 
an examination of the analysis of variance table. The variance components 
estimates are obtained by equating mean squares to their respective expected 
values and solving the resultant equations for the corresponding variance com- 
ponents. 
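
For the all-random case (Model II), equating mean squares to expected mean squares can be sketched as follows. The expected mean squares used here are the standard ones for this layout and are stated as an assumption in the comments; the function name and the numerical inputs are hypothetical and serve only to show that the solution recovers known components.

```python
def model2_variance_components(ms, a, b, c, n):
    """Method-of-moments estimates under Model II, assuming the expected mean squares
       E(MS_E)     = s_e
       E(MS_BC(A)) = s_e + n*s_bc
       E(MS_AC)    = s_e + n*s_bc + b*n*s_ac
       E(MS_B(A))  = s_e + n*s_bc + c*n*s_b
       E(MS_C)     = s_e + n*s_bc + b*n*s_ac + a*b*n*s_c
       E(MS_A)     = s_e + n*s_bc + b*n*s_ac + c*n*s_b + b*c*n*s_a
    """
    return {
        "e": ms["E"],
        "bc(a)": (ms["BC(A)"] - ms["E"]) / n,
        "ac": (ms["AC"] - ms["BC(A)"]) / (b * n),
        "b(a)": (ms["B(A)"] - ms["BC(A)"]) / (c * n),
        "c": (ms["C"] - ms["AC"]) / (a * b * n),
        "a": (ms["A"] - ms["B(A)"] - ms["AC"] + ms["BC(A)"]) / (b * c * n),
    }

# Hypothetical design sizes and "true" components, used to build mean squares
# that equal their expectations exactly.
a, b, c, n = 3, 5, 4, 3
true = {"e": 1.0, "bc(a)": 2.0, "ac": 3.0, "b(a)": 4.0, "c": 5.0, "a": 6.0}

base = true["e"] + n * true["bc(a)"]
ms = {
    "E": true["e"],
    "BC(A)": base,
    "AC": base + b * n * true["ac"],
    "B(A)": base + c * n * true["b(a)"],
    "C": base + b * n * true["ac"] + a * b * n * true["c"],
    "A": base + b * n * true["ac"] + c * n * true["b(a)"] + b * c * n * true["a"],
}

estimates = model2_variance_components(ms, a, b, c, n)
print(estimates)
```

With real data the differences of mean squares can be negative, in which case these moment estimates fall below zero; the sketch does not attempt to handle that situation.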


Remark: In a partially nested situation, it is useful to remember the following rule of
thumb for calculating the degrees of freedom. The number of degrees of freedom for a
crossed factor is one less than the number of levels of the factor; for a nested factor,
the number of degrees of freedom is that same quantity multiplied by the number of
levels of each of the factors within which it is nested.


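TABLE 8.1
Analysis of Variance for Model (8.1.1)

Source of                    Degrees of            Sum of         Mean
Variation                    Freedom               Squares        Square
Due to A                     a - 1                 $SS_A$         $MS_A$
Due to B (within A)          a(b - 1)              $SS_{B(A)}$    $MS_{B(A)}$
Due to C                     c - 1                 $SS_C$         $MS_C$
Due to A × C                 (a - 1)(c - 1)        $SS_{AC}$      $MS_{AC}$
Due to B × C (within A)      a(b - 1)(c - 1)       $SS_{BC(A)}$   $MS_{BC(A)}$
Error                        abc(n - 1)            $SS_E$         $MS_E$
Total                        abcn - 1              $SS_T$

Expected Mean Square, Model I (A, B, and C Fixed)

$MS_A$:       $\sigma_e^2 + \frac{bcn}{a-1}\sum_{i=1}^{a}\alpha_i^2$
$MS_{B(A)}$:  $\sigma_e^2 + \frac{cn}{a(b-1)}\sum_{i=1}^{a}\sum_{j=1}^{b}\beta_{j(i)}^2$
$MS_C$:       $\sigma_e^2 + \frac{abn}{c-1}\sum_{k=1}^{c}\gamma_k^2$
$MS_{AC}$:    $\sigma_e^2 + \frac{bn}{(a-1)(c-1)}\sum_{i=1}^{a}\sum_{k=1}^{c}(\alpha\gamma)_{ik}^2$
$MS_{BC(A)}$: $\sigma_e^2 + \frac{n}{a(b-1)(c-1)}\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}(\beta\gamma)_{jk(i)}^2$
$MS_E$:       $\sigma_e^2$

Expected Mean Square, Model II (A, B, and C Random)

$MS_A$:       $\sigma_e^2 + n\sigma_{\beta\gamma(\alpha)}^2 + bn\sigma_{\alpha\gamma}^2 + cn\sigma_{\beta(\alpha)}^2 + bcn\sigma_\alpha^2$
$MS_{B(A)}$:  $\sigma_e^2 + n\sigma_{\beta\gamma(\alpha)}^2 + cn\sigma_{\beta(\alpha)}^2$
$MS_C$:       $\sigma_e^2 + n\sigma_{\beta\gamma(\alpha)}^2 + bn\sigma_{\alpha\gamma}^2 + abn\sigma_\gamma^2$
$MS_{AC}$:    $\sigma_e^2 + n\sigma_{\beta\gamma(\alpha)}^2 + bn\sigma_{\alpha\gamma}^2$
$MS_{BC(A)}$: $\sigma_e^2 + n\sigma_{\beta\gamma(\alpha)}^2$
$MS_E$:       $\sigma_e^2$

Expected Mean Square, Model III (A and C Fixed, B Random)

$MS_A$:       $\sigma_e^2 + cn\sigma_{\beta(\alpha)}^2 + \frac{bcn}{a-1}\sum_{i=1}^{a}\alpha_i^2$
$MS_{B(A)}$:  $\sigma_e^2 + cn\sigma_{\beta(\alpha)}^2$
$MS_C$:       $\sigma_e^2 + n\sigma_{\beta\gamma(\alpha)}^2 + \frac{abn}{c-1}\sum_{k=1}^{c}\gamma_k^2$
$MS_{AC}$:    $\sigma_e^2 + n\sigma_{\beta\gamma(\alpha)}^2 + \frac{bn}{(a-1)(c-1)}\sum_{i=1}^{a}\sum_{k=1}^{c}(\alpha\gamma)_{ik}^2$
$MS_{BC(A)}$: $\sigma_e^2 + n\sigma_{\beta\gamma(\alpha)}^2$
$MS_E$:       $\sigma_e^2$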




8.3 COMPUTATIONAL FORMULAE AND PROCEDURE 


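The following formulae may be used for calculating the sums of squares:

$$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\sum_{\ell=1}^{n} y_{ijk\ell}^2 - \frac{y_{....}^2}{abcn},$$

$$SS_A = \frac{1}{bcn}\sum_{i=1}^{a} y_{i...}^2 - \frac{y_{....}^2}{abcn},$$

$$SS_{B(A)} = \frac{1}{cn}\sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij..}^2 - \frac{1}{bcn}\sum_{i=1}^{a} y_{i...}^2,$$

$$SS_C = \frac{1}{abn}\sum_{k=1}^{c} y_{..k.}^2 - \frac{y_{....}^2}{abcn},$$

$$SS_{AC} = \frac{1}{bn}\sum_{i=1}^{a}\sum_{k=1}^{c} y_{i.k.}^2 - \frac{1}{bcn}\sum_{i=1}^{a} y_{i...}^2 - \frac{1}{abn}\sum_{k=1}^{c} y_{..k.}^2 + \frac{y_{....}^2}{abcn},$$

$$SS_{BC(A)} = \frac{1}{n}\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c} y_{ijk.}^2 - \frac{1}{cn}\sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij..}^2 - \frac{1}{bn}\sum_{i=1}^{a}\sum_{k=1}^{c} y_{i.k.}^2 + \frac{1}{bcn}\sum_{i=1}^{a} y_{i...}^2,$$

and

$$SS_E = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\sum_{\ell=1}^{n} y_{ijk\ell}^2 - \frac{1}{n}\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c} y_{ijk.}^2.$$

Examining the forms of $SS_{B(A)}$ and $SS_{BC(A)}$, we notice that these formulae can be written as

$$SS_{B(A)} = \sum_{i=1}^{a}\left[\frac{1}{cn}\sum_{j=1}^{b} y_{ij..}^2 - \frac{y_{i...}^2}{bcn}\right]$$

and

$$SS_{BC(A)} = \sum_{i=1}^{a}\left[\frac{1}{n}\sum_{j=1}^{b}\sum_{k=1}^{c} y_{ijk.}^2 - \frac{1}{cn}\sum_{j=1}^{b} y_{ij..}^2 - \frac{1}{bn}\sum_{k=1}^{c} y_{i.k.}^2 + \frac{y_{i...}^2}{bcn}\right].$$

Thus, $SS_{B(A)}$ can be obtained by first calculating the sums of squares among
levels of B for each level of A, then pooling over all levels of A; and $SS_{BC(A)}$
can be obtained by first computing the B x C interaction sums of squares
for each level of A, then pooling over all levels of A. Their degrees of freedom
can also be determined in a similar manner.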
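The computational formulae of this section, including the pooling over levels of A, can be sketched as follows. This is a minimal illustration in plain Python, assuming a balanced layout stored as a nested list `y[i][j][k][l]`; the function name `partially_nested_ss` and the toy data are hypothetical and serve only to check that the pieces add up to the total sum of squares.

```python
def partially_nested_ss(y):
    """Sums of squares for a balanced layout y[i][j][k][l].

    i indexes A, j indexes B nested within A, k indexes C (crossed with
    A and B), and l indexes the n replicates in each cell.
    """
    a, b, c, n = len(y), len(y[0]), len(y[0][0]), len(y[0][0][0])

    def sq_sum(values):
        return sum(v * v for v in values)

    # Totals written with dots in the text.
    y_ijk = [[[sum(y[i][j][k]) for k in range(c)] for j in range(b)] for i in range(a)]
    y_ij = [[sum(y_ijk[i][j]) for j in range(b)] for i in range(a)]
    y_ik = [[sum(y_ijk[i][j][k] for j in range(b)) for k in range(c)] for i in range(a)]
    y_i = [sum(y_ij[i]) for i in range(a)]
    y_k = [sum(y_ik[i][k] for i in range(a)) for k in range(c)]
    cf = sum(y_i) ** 2 / (a * b * c * n)  # correction term y....^2 / abcn

    raw = sq_sum(v for i in range(a) for j in range(b) for k in range(c) for v in y[i][j][k])
    cells = sq_sum(t for i in range(a) for j in range(b) for t in y_ijk[i][j]) / n
    a_term = sq_sum(y_i) / (b * c * n)
    c_term = sq_sum(y_k) / (a * b * n)
    ac_term = sq_sum(t for row in y_ik for t in row) / (b * n)

    ss = {
        "A": a_term - cf,
        "C": c_term - cf,
        "AC": ac_term - a_term - c_term + cf,
        "E": raw - cells,
        "T": raw - cf,
    }
    # Nested terms: compute within each level of A, then pool over A.
    ss["B(A)"] = sum(
        sq_sum(y_ij[i]) / (c * n) - y_i[i] ** 2 / (b * c * n) for i in range(a)
    )
    ss["BC(A)"] = sum(
        sq_sum(t for row in y_ijk[i] for t in row) / n
        - sq_sum(y_ij[i]) / (c * n)
        - sq_sum(y_ik[i]) / (b * n)
        + y_i[i] ** 2 / (b * c * n)
        for i in range(a)
    )
    return ss


# Hypothetical toy data with a = 2, b = 2, c = 3, n = 2.
toy = [[[[(3 * i + 5 * j + 2 * k + l + i * j * k) % 7 for l in range(2)]
         for k in range(3)] for j in range(2)] for i in range(2)]
ss = partially_nested_ss(toy)
parts = sum(ss[name] for name in ("A", "B(A)", "C", "AC", "BC(A)", "E"))
print(ss)
```

Because the component sums of squares partition the total, `parts` should agree with `ss["T"]` up to rounding error.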




8.4 A FOUR-FACTOR PARTIALLY NESTED CLASSIFICATION 


In this section, we briefly outline the analysis of variance for a four-factor 
partially nested classification. Consider four factors A, B, C, and D having 
a, b, c, and d levels, respectively. Let b levels of factor B be nested under 
each level of A, let c levels of factor C be nested under each level of B, and 
let d levels of factor D be crossed with a levels of A, b levels of B, and c 
levels of C. The model for this type of experimental layout can be written 
as 


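$$y_{ijk\ell m} = \mu + \alpha_i + \beta_{j(i)} + \gamma_{k(ij)} + \delta_\ell + (\alpha\delta)_{i\ell} + (\beta\delta)_{j\ell(i)} + (\gamma\delta)_{k\ell(ij)} + e_{m(ijk\ell)}
\qquad
\begin{cases} i = 1, \ldots, a \\ j = 1, \ldots, b \\ k = 1, \ldots, c \\ \ell = 1, \ldots, d \\ m = 1, \ldots, n, \end{cases}$$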


where the meaning of each symbol and the assumptions of the model are readily 
stated. Starting with the identity 


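$$\begin{aligned}
y_{ijk\ell m} - \bar{y}_{.....} ={}& (\bar{y}_{i....} - \bar{y}_{.....}) + (\bar{y}_{ij...} - \bar{y}_{i....}) + (\bar{y}_{ijk..} - \bar{y}_{ij...}) \\
&+ (\bar{y}_{...\ell.} - \bar{y}_{.....}) + (\bar{y}_{i..\ell.} - \bar{y}_{i....} - \bar{y}_{...\ell.} + \bar{y}_{.....}) \\
&+ (\bar{y}_{ij.\ell.} - \bar{y}_{ij...} - \bar{y}_{i..\ell.} + \bar{y}_{i....}) \\
&+ (\bar{y}_{ijk\ell.} - \bar{y}_{ijk..} - \bar{y}_{ij.\ell.} + \bar{y}_{ij...}) \\
&+ (y_{ijk\ell m} - \bar{y}_{ijk\ell.}),
\end{aligned}$$

the total sum of squares is partitioned as

$$SS_T = SS_A + SS_{B(A)} + SS_{C(B)} + SS_D + SS_{AD} + SS_{BD(A)} + SS_{CD(B)} + SS_E,$$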


where 


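$$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\sum_{\ell=1}^{d}\sum_{m=1}^{n} (y_{ijk\ell m} - \bar{y}_{.....})^2,$$

$$SS_A = bcdn \sum_{i=1}^{a} (\bar{y}_{i....} - \bar{y}_{.....})^2,$$

$$SS_{B(A)} = cdn \sum_{i=1}^{a}\sum_{j=1}^{b} (\bar{y}_{ij...} - \bar{y}_{i....})^2,$$

$$SS_{C(B)} = dn \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c} (\bar{y}_{ijk..} - \bar{y}_{ij...})^2,$$

$$SS_D = abcn \sum_{\ell=1}^{d} (\bar{y}_{...\ell.} - \bar{y}_{.....})^2,$$

$$SS_{AD} = bcn \sum_{i=1}^{a}\sum_{\ell=1}^{d} (\bar{y}_{i..\ell.} - \bar{y}_{i....} - \bar{y}_{...\ell.} + \bar{y}_{.....})^2,$$

$$SS_{BD(A)} = cn \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{\ell=1}^{d} (\bar{y}_{ij.\ell.} - \bar{y}_{ij...} - \bar{y}_{i..\ell.} + \bar{y}_{i....})^2,$$

$$SS_{CD(B)} = n \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\sum_{\ell=1}^{d} (\bar{y}_{ijk\ell.} - \bar{y}_{ijk..} - \bar{y}_{ij.\ell.} + \bar{y}_{ij...})^2,$$

and

$$SS_E = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\sum_{\ell=1}^{d}\sum_{m=1}^{n} (y_{ijk\ell m} - \bar{y}_{ijk\ell.})^2,$$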


with the usual notations of dots and bars. The corresponding mean squares,
denoted by $MS_A$, $MS_{B(A)}$, $MS_{C(B)}$, $MS_D$, $MS_{AD}$, $MS_{BD(A)}$, $MS_{CD(B)}$, and $MS_E$,
are obtained by dividing the sums of squares by the respective degrees of freedom.
The resultant analysis of variance is summarized in Table 8.2. The proper
test statistic for any main effect or interaction of interest can be obtained from
an examination of the analysis of variance table. The variance components
estimates, as usual, are obtained by equating the mean squares to their respective
expected values and solving the resulting equations.


8.5 WORKED EXAMPLE FOR MODEL II


Schultz (1954) discussed the results of an analysis of variance performed on
data on the calcium, phosphorus, and magnesium content of turnip leaves. The
data were obtained as follows. “Duplicate [microchemical] analyses were made
on each of four randomly-selected leaves from each of four turnip plants picked
at random. ... Duplicate determinations were made on each ash solution from
a particular leaf. ... The analyses of the two sets of ash solutions were made
at different times.” The analysis of variance for the calcium data is given in
Table 8.3.

It is evident from the structure of the experiment that the plants are crossed 
with ashings and leaves are nested within plants. Since both plants and leaves 
within plants were randomly selected, both these factors should be regarded as 
random. In addition, the factor ashing should also be assumed as random inas- 
much as two ashings might be regarded as coming from repeated experiments 
on the same leaves (a new random sample from each leaf might be taken at 
future periods, ashed, and then analyzed in duplicate). 


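TABLE 8.2
Analysis of Variance for Model (8.4.1)

Source of                Degrees of               Sum of        Mean
Variation                Freedom                  Squares       Square
Due to A                 a - 1                    SS_A          MS_A
Due to B (within A)      a(b - 1)                 SS_B(A)       MS_B(A)
Due to C (within B)      ab(c - 1)                SS_C(B)       MS_C(B)
Due to D                 d - 1                    SS_D          MS_D
Interaction A x D        (a - 1)(d - 1)           SS_AD         MS_AD
Interaction B x D        a(b - 1)(d - 1)          SS_BD(A)      MS_BD(A)
  (within A)
Interaction C x D        ab(c - 1)(d - 1)         SS_CD(B)      MS_CD(B)
  (within B)
Error                    abcd(n - 1)              SS_E          MS_E
Total                    abcdn - 1                SS_T

Expected mean squares*

Model I (all factors fixed):

  MS_A:      $\sigma_e^2 + \frac{bcdn}{a-1}\sum_{i=1}^{a}\alpha_i^2$
  MS_B(A):   $\sigma_e^2 + \frac{cdn}{a(b-1)}\sum_{i=1}^{a}\sum_{j=1}^{b}\beta_{j(i)}^2$
  MS_C(B):   $\sigma_e^2 + \frac{dn}{ab(c-1)}\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\gamma_{k(ij)}^2$
  MS_D:      $\sigma_e^2 + \frac{abcn}{d-1}\sum_{\ell=1}^{d}\delta_\ell^2$
  MS_AD:     $\sigma_e^2 + \frac{bcn}{(a-1)(d-1)}\sum_{i=1}^{a}\sum_{\ell=1}^{d}(\alpha\delta)_{i\ell}^2$
  MS_BD(A):  $\sigma_e^2 + \frac{cn}{a(b-1)(d-1)}\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{\ell=1}^{d}(\beta\delta)_{j\ell(i)}^2$
  MS_CD(B):  $\sigma_e^2 + \frac{n}{ab(c-1)(d-1)}\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\sum_{\ell=1}^{d}(\gamma\delta)_{k\ell(ij)}^2$
  MS_E:      $\sigma_e^2$

Model II (all factors random):

  MS_A:      $\sigma_e^2 + n\sigma_{\gamma\delta}^2 + cn\sigma_{\beta\delta}^2 + bcn\sigma_{\alpha\delta}^2 + dn\sigma_{\gamma}^2 + cdn\sigma_{\beta}^2 + bcdn\sigma_{\alpha}^2$
  MS_B(A):   $\sigma_e^2 + n\sigma_{\gamma\delta}^2 + cn\sigma_{\beta\delta}^2 + dn\sigma_{\gamma}^2 + cdn\sigma_{\beta}^2$
  MS_C(B):   $\sigma_e^2 + n\sigma_{\gamma\delta}^2 + dn\sigma_{\gamma}^2$
  MS_D:      $\sigma_e^2 + n\sigma_{\gamma\delta}^2 + cn\sigma_{\beta\delta}^2 + bcn\sigma_{\alpha\delta}^2 + abcn\sigma_{\delta}^2$
  MS_AD:     $\sigma_e^2 + n\sigma_{\gamma\delta}^2 + cn\sigma_{\beta\delta}^2 + bcn\sigma_{\alpha\delta}^2$
  MS_BD(A):  $\sigma_e^2 + n\sigma_{\gamma\delta}^2 + cn\sigma_{\beta\delta}^2$
  MS_CD(B):  $\sigma_e^2 + n\sigma_{\gamma\delta}^2$
  MS_E:      $\sigma_e^2$

* Various types of mixed models are possible and the corresponding expected mean squares can similarly be obtained.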


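TABLE 8.3
Analysis of Variance for the Calcium Content of Turnip Leaves Data

Source of                Degrees of    Mean
Variation                Freedom       Square       F Value    p-Value
Plants                    3            6.202154
Leaves                   12            0.605917     43.38      <0.001
  (within plants)
Ashings                   1            0.02945       0.88       0.417
Plants x Ashings          3            0.033569      2.40       0.119
Ashings x Leaves         12            0.013968      3.92      <0.001
  (within plants)
Error                    32            0.003560
Total                    63

Expected mean squares:

  Plants:             $\sigma_e^2 + 2\sigma_{\beta\gamma(\alpha)}^2 + 4 \times 2\sigma_{\alpha\gamma}^2 + 2 \times 2\sigma_{\beta(\alpha)}^2 + 4 \times 2 \times 2\sigma_{\alpha}^2$
  Leaves
   (within plants):   $\sigma_e^2 + 2\sigma_{\beta\gamma(\alpha)}^2 + 2 \times 2\sigma_{\beta(\alpha)}^2$
  Ashings:            $\sigma_e^2 + 2\sigma_{\beta\gamma(\alpha)}^2 + 4 \times 2\sigma_{\alpha\gamma}^2 + 4 \times 4 \times 2\sigma_{\gamma}^2$
  Plants x Ashings:   $\sigma_e^2 + 2\sigma_{\beta\gamma(\alpha)}^2 + 4 \times 2\sigma_{\alpha\gamma}^2$
  Ashings x Leaves
   (within plants):   $\sigma_e^2 + 2\sigma_{\beta\gamma(\alpha)}^2$
  Error:              $\sigma_e^2$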


Source: Schultz (1954). Used with permission. 


The mathematical model for the experimental design would be 


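$$y_{ijk\ell} = \mu + \alpha_i + \beta_{j(i)} + \gamma_k + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk(i)} + e_{\ell(ijk)}
\qquad
\begin{cases} i = 1, 2, 3, 4 \\ j = 1, 2, 3, 4 \\ k = 1, 2 \\ \ell = 1, 2, \end{cases}$$

where $\mu$ is the general mean, $\alpha_i$ is the effect of the i-th plant, $\beta_{j(i)}$ is the effect
of the j-th leaf within the i-th plant, $\gamma_k$ is the effect of the k-th ashing, $(\alpha\gamma)_{ik}$ is
the interaction of the i-th plant with the k-th ashing, $(\beta\gamma)_{jk(i)}$ is the interaction
of the j-th leaf with the k-th ashing within the i-th plant, and $e_{\ell(ijk)}$ is the
customary error term (analysis in duplicates). Under the assumption that all the
effects are random, the $\alpha_i$'s, $\beta_{j(i)}$'s, $\gamma_k$'s, $(\alpha\gamma)_{ik}$'s, $(\beta\gamma)_{jk(i)}$'s, and $e_{\ell(ijk)}$'s are
normally distributed with zero means and variances $\sigma_\alpha^2$, $\sigma_{\beta(\alpha)}^2$, $\sigma_\gamma^2$, $\sigma_{\alpha\gamma}^2$, $\sigma_{\beta\gamma(\alpha)}^2$,
and $\sigma_e^2$, respectively.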

The test of the null hypothesis that the variance component due to a particular 
source is zero can be based on the ratio of the mean square of the source to 
that mean square whose expectation is the same as the expectation of the mean 
square being tested except for the component due to the source of variation 
being tested which is equal to zero under the null hypothesis of no effect. 
Thus, the ashings x leaves (within plants) interaction is tested against the error 
mean square and the difference is highly significant (p < 0.001). Similarly, the 
plants x ashings interaction is tested against ashings x leaves (within plants) 




and has a probability of occurrence greater than 10 percent due to chance alone
(p = 0.119). Among main effects, ashings is tested against plants x ashings
and is not found to be significant (p = 0.417); leaves (within plants) is tested
against ashings x leaves (within plants) and is found to be highly significant
(p < 0.001). No exact test is available for plants, however, since no single mean
square has the required expectation. As discussed in Section 5.5, an approximate
test may be constructed using the test statistic


$$F' = \frac{MS_A}{MS_{AC} + MS_{B(A)} - MS_{BC(A)}},$$

which has an approximate F distribution with $df_A$ and $\nu'$ degrees of freedom
where

$$\nu' = \frac{(MS_{AC} + MS_{B(A)} - MS_{BC(A)})^2}{\dfrac{(MS_{AC})^2}{df_{AC}} + \dfrac{(MS_{B(A)})^2}{df_{B(A)}} + \dfrac{(MS_{BC(A)})^2}{df_{BC(A)}}}.$$

In the example, the values of $F'$ and $\nu'$ (rounded to the nearest digit) are found
to be:

$$F' = \frac{6.202154}{0.033569 + 0.605917 - 0.013968} = 9.92$$

and

$$\nu' = \frac{(0.033569 + 0.605917 - 0.013968)^2}{\dfrac{(0.033569)^2}{3} + \dfrac{(0.605917)^2}{12} + \dfrac{(0.013968)^2}{12}} = 13.$$


From Appendix Table V, it is found that the values as large as 9.92 at 3 and 13 
degrees of freedom occur in less than 1 percent of trials due to chance alone 
(p =0.001). Thus, there is strong evidence that plants are to be regarded as 
differing significantly in calcium content. 

As pointed out in Section 5.5, an alternate test for plants may be based on 
the statistic 


$$F'' = \frac{MS_A + MS_{BC(A)}}{MS_{AC} + MS_{B(A)}},$$

which has an approximate F distribution with $\nu_1''$ and $\nu_2''$ degrees of freedom
where

$$\nu_1'' = \frac{(MS_A + MS_{BC(A)})^2}{\dfrac{(MS_A)^2}{df_A} + \dfrac{(MS_{BC(A)})^2}{df_{BC(A)}}}$$

and

$$\nu_2'' = \frac{(MS_{AC} + MS_{B(A)})^2}{\dfrac{(MS_{AC})^2}{df_{AC}} + \dfrac{(MS_{B(A)})^2}{df_{B(A)}}}.$$

Again, in the example at hand, the values of $F''$, $\nu_1''$, and $\nu_2''$ (rounded to the
nearest digit) are found to be:

$$F'' = \frac{6.202154 + 0.013968}{0.033569 + 0.605917} = 9.72,$$

$$\nu_1'' = \frac{(6.202154 + 0.013968)^2}{\dfrac{(6.202154)^2}{3} + \dfrac{(0.013968)^2}{12}} = 3,$$

and

$$\nu_2'' = \frac{(0.033569 + 0.605917)^2}{\dfrac{(0.033569)^2}{3} + \dfrac{(0.605917)^2}{12}} = 13.$$
These values are essentially the same as for the test statistic F' and we reach
exactly the same conclusion as before.
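The arithmetic behind the two approximate tests can be checked with a short Python sketch. It is only an illustration of the Satterthwaite-type degrees-of-freedom formula quoted above; the helper name `approx_df` and the dictionary keys are assumptions made for the sketch, and the mean squares are those of Table 8.3.

```python
def approx_df(terms):
    """Approximate degrees of freedom for a linear combination of mean squares.

    terms is a list of (coefficient, mean_square, degrees_of_freedom).
    """
    numerator = sum(coef * ms for coef, ms, _ in terms) ** 2
    denominator = sum((coef * ms) ** 2 / df for coef, ms, df in terms)
    return numerator / denominator


# Mean squares and degrees of freedom from Table 8.3.
ms = {"A": 6.202154, "B(A)": 0.605917, "AC": 0.033569, "BC(A)": 0.013968}
df = {"A": 3, "B(A)": 12, "AC": 3, "BC(A)": 12}

# First test: MS_A against MS_AC + MS_B(A) - MS_BC(A).
denom_1 = [(1, ms["AC"], df["AC"]), (1, ms["B(A)"], df["B(A)"]),
           (-1, ms["BC(A)"], df["BC(A)"])]
f_prime = ms["A"] / sum(coef * m for coef, m, _ in denom_1)
nu_prime = approx_df(denom_1)

# Second test: MS_A + MS_BC(A) against MS_AC + MS_B(A).
numer_2 = [(1, ms["A"], df["A"]), (1, ms["BC(A)"], df["BC(A)"])]
denom_2 = [(1, ms["AC"], df["AC"]), (1, ms["B(A)"], df["B(A)"])]
f_double_prime = (sum(coef * m for coef, m, _ in numer_2)
                  / sum(coef * m for coef, m, _ in denom_2))
nu1 = approx_df(numer_2)
nu2 = approx_df(denom_2)

print(f"F'  = {f_prime:.2f} with {df['A']} and {nu_prime:.1f} df")
print(f"F'' = {f_double_prime:.2f} with {nu1:.1f} and {nu2:.1f} df")
```

The second form avoids subtracting a mean square, so its denominator cannot become negative; in this example both forms give nearly the same statistic and degrees of freedom.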
Finally, the estimates of the variance components are given by


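$$\hat{\sigma}_e^2 = 0.00356,$$

$$\hat{\sigma}_{\beta\gamma(\alpha)}^2 = \frac{1}{2}(0.013968 - 0.003560) = 0.00520,$$

$$\hat{\sigma}_{\alpha\gamma}^2 = \frac{1}{8}(0.033569 - 0.013968) = 0.00245,$$

$$\hat{\sigma}_{\gamma}^2 = \frac{1}{32}(0.02945 - 0.033569) = -0.00013,$$

$$\hat{\sigma}_{\beta(\alpha)}^2 = \frac{1}{4}(0.605917 - 0.013968) = 0.14799,$$

and

$$\hat{\sigma}_{\alpha}^2 = \frac{1}{16}(6.202154 - 0.033569 - 0.605917 + 0.013968) = 0.34854.$$

Assuming that $\sigma_\gamma^2 = 0$, these variance components account for 0.7, 1.0, 0.5, 0.0,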
29.2 and 68.6 percent of the total variation. Thus, the largest single source of 
variability is attributable to variation between plants and may account for nearly 
70 percent of the total variation. Leaves within plants are also quite variable 
and may account for most of the remaining variation. Although the effect due 
to ashings x leaves (within plants) interactions is statistically significant, it 
accounts for only 1 percent of the total variation. 
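The estimates above follow from equating the mean squares of Table 8.3 to their expected values. A minimal Python sketch of that calculation is given below; the variable names are illustrative assumptions, and replacing the negative estimate by zero mirrors the assumption made in the text before the percentages are computed.

```python
# Layout of the turnip example: a = 4 plants, b = 4 leaves per plant,
# c = 2 ashings, n = 2 duplicate determinations.
a, b, c, n = 4, 4, 2, 2

# Mean squares from Table 8.3.
ms = {"A": 6.202154, "B(A)": 0.605917, "C": 0.02945,
      "AC": 0.033569, "BC(A)": 0.013968, "E": 0.003560}

# Solve the expected-mean-square equations of Model II.
est = {
    "error": ms["E"],
    "ashings x leaves": (ms["BC(A)"] - ms["E"]) / n,
    "plants x ashings": (ms["AC"] - ms["BC(A)"]) / (b * n),
    "ashings": (ms["C"] - ms["AC"]) / (a * b * n),
    "leaves": (ms["B(A)"] - ms["BC(A)"]) / (c * n),
    "plants": (ms["A"] - ms["AC"] - ms["B(A)"] + ms["BC(A)"]) / (b * c * n),
}

# A negative estimate is treated as zero when apportioning the total.
truncated = {name: max(value, 0.0) for name, value in est.items()}
total = sum(truncated.values())
percent = {name: 100 * value / total for name, value in truncated.items()}

for name in est:
    print(f"{name:18s} {est[name]:9.5f} {percent[name]:6.1f}%")
```

Up to rounding, the printed shares agree with the percentages quoted above, with plants accounting for roughly two-thirds of the total variation.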




TABLE 8.4 
Measured Strengths of Tire Cords from Two Plants Using Different 
Production Processes 
Distance (yds) 
0 500 1,000 1,500 2,000 2,500 


Plant Bobbin 1 2 1 2 1 2 1 2 1 2 1 2
1 11). =f <5. 22 =8 <2 Se et Oo eI 
2(1) 1 10 1 x 9 2 10 —-4 -4 3 4 8 
31) 2 3 5 -5 1 -1 -6 tL 2 5 7 5 
4(1) 6 10 1 5 0 5 2 —2 1 1 5 9 
B(1)> 0. SB. SS TOY AS) ed. RP SS 6 
61) -1 -10 -8 -8 -2 Se Seg: as ae 2g 
7(1) —9 —2 5 —2 7 —2 -—2 —2 -l 2 10 5 
8(1) 0 » 5 2 Ss 3 10 -1 4 1 7 -1 
2 1(2) 10 8 —5 6 2 13 7 +15 17 «+14 «+18 I 


22) 9 12 6 15 15 12 18 16 13 #10 9 U1 
3(2) 0 8 12 6 2 0 5 4 18 8 6 8 
42) 5 9 2 16 15 5 21 18 15 It 18 15 


52) -1 -1 lt 19 12 10 1 2 13 9 4 6 
62) 7 1 15 WW 12 12 8 12 22 I 12 21 
7(2) —5 1 2 10 12 1 2 #1 10 #10 #7 = 5 


8(2) 10 9 10 1 9 6 12 WW 1 20 WU 15 


Source: Akutowicz and Traux (1956, Table 1, p. 4). Used with permission. 


8.6 WORKED EXAMPLE FOR MODEL III


Akutowicz and Traux (1956) described an experiment designed to investigate 
the variability of the strength of tire cord. Prior to the establishment of control of 
cord testing laboratories, data were obtained from two plants that used different 
production processes to make nominally the same kind of tire cord. A random 
sample of eight bobbins of cord was selected from each plant and six 500-yard 
intervals over the length of each bobbin were determined. In order to give as 
nearly as possible “duplicate” measurements, adjacent pairs of breaks were 
made at each interval measuring the recorded strength in 0.1 lb deviations from 
21.5 lb. The coded raw data are given in Table 8.4. 

It is evident that the structure of this experiment is somewhat different from 
the crossed and nested classification models discussed in the earlier chapters. 
If the bobbins were crossed with the plants, so that the first bobbin with plant 
1 had some correspondence with the first bobbin with plant 2, and the second 
bobbin in like manner, and so on, then one would have a three-way crossed 
classification with replication in the cells. However, clearly, this is not the 
situation. The bobbins are not crossed with plants; rather they are nested within 




the plants since eight bobbins of cords were selected at random from each 
plant. Similarly, if the distances were obtained as random samples from each 
bobbin, so that they were nested within the bobbins, with no crossing between 
distance 1 and bobbins or plants, then one would have a completely nested or 
hierarchical classification. However, again, this is not so. In this experiment, the 
distances were chosen at 500-yard intervals over the length of each bobbin and 
thus constitute a fixed effect which is crossed with bobbins and plants. Thus, the 
experimental structure conforms to the partially nested or hierarchical design, 
described earlier in this chapter, where the distances are crossed with the plants 
and bobbins and bobbins are nested within plants. 
The mathematical model for the experimental design would be 


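$$y_{ijk\ell} = \mu + \alpha_i + \beta_{j(i)} + \gamma_k + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk(i)} + e_{\ell(ijk)}
\qquad
\begin{cases} i = 1, 2 \\ j = 1, 2, \ldots, 8 \\ k = 1, 2, \ldots, 6 \\ \ell = 1, 2, \end{cases}$$

where $\mu$ is the general mean, $\alpha_i$ is the effect of the i-th plant, $\beta_{j(i)}$ is the effect of
the j-th bobbin nested within the i-th plant, $\gamma_k$ is the effect of the k-th distance,
$(\alpha\gamma)_{ik}$ is the interaction of the i-th plant with the k-th distance, $(\beta\gamma)_{jk(i)}$ is the
interaction of the k-th distance with the j-th bobbin within the i-th plant, and
$e_{\ell(ijk)}$ is the customary error term. Furthermore, the $\alpha_i$'s, $\gamma_k$'s, and $(\alpha\gamma)_{ik}$'s are
fixed effects with the constraints:

$$\sum_{i=1}^{2}\alpha_i = 0, \quad \sum_{k=1}^{6}\gamma_k = 0, \quad \sum_{i=1}^{2}(\alpha\gamma)_{ik} = 0, \quad \sum_{k=1}^{6}(\alpha\gamma)_{ik} = 0;$$

and the $\beta_{j(i)}$'s, $(\beta\gamma)_{jk(i)}$'s, and $e_{\ell(ijk)}$'s are random effects that are independently
and normally distributed with mean zero and variances $\sigma_{\beta(\alpha)}^2$, $\sigma_{\beta\gamma(\alpha)}^2$, and $\sigma_e^2$,
respectively.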

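To calculate the sums of squares, we first form the bobbins within plants
totals ($y_{ij..}$), plant x distance totals ($y_{i.k.}$), plant totals ($y_{i...}$), distance totals
($y_{..k.}$), and grand total ($y_{....}$), as shown in Table 8.5. The other quantities needed
in the calculations of the sums of squares are:

$$\frac{y_{....}^2}{2 \times 8 \times 6 \times 2} = \frac{(1{,}016)^2}{192} = 5{,}376.333,$$

$$\sum_{i=1}^{2}\sum_{j=1}^{8}\sum_{k=1}^{6}\sum_{\ell=1}^{2} y_{ijk\ell}^2 = (-1)^2 + (-5)^2 + \cdots + (15)^2 = 15{,}788,$$

$$\frac{1}{2}\sum_{i=1}^{2}\sum_{j=1}^{8}\sum_{k=1}^{6} y_{ijk.}^2 = \frac{(-6)^2 + (-10)^2 + \cdots + (26)^2}{2} = 13{,}697,$$

$$\frac{1}{6 \times 2}\sum_{i=1}^{2}\sum_{j=1}^{8} y_{ij..}^2 = \frac{(-31)^2 + (35)^2 + \cdots + (156)^2}{12} = 11{,}347.167,$$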



TABLE 8.5 
Calculation of Cell and Marginal Totals 
Yijk. 
Distance (yds) 
Plant Bobbin 0 500 1,000 1,500 2,000 2,500 yi Yi... 
1 1(1) -6 —10 —7 -1 -8 -3i1 
2(1) 11 3 4 6 -l 12 35 
3(1) 1 0 0 —5 7 12 13 
4(1) 16 6 5 —4 2 14 39 31 
5(1)  -9 -5  —-4 ~3 0 9 —-12 
61) —-ll —16 0 -3 -9 -6 —45 
71) 9-11 3 5 —4 15 9 
8(1) 2 —7 8 9 5 6 23 
Yk 9 26 19-11 4 54 
2 1(2) 18 15 223 29116 
2(2) 21 21 27 3423 20s—s«*146 
3(2) 8 18 2 9 26 14 77 
4(2) 14 18 20 39 2 £33 150 985 
5(2)  -2 30 «22 21 22 10 ~—-:103 
6(2) 23 26 224 202 «33 33s«d9 
7(2)  —4 S. “27 15-20 12 78 
8(2) 19 25 25 233 «438 «~=— 6 Ss«d:56 
Yok 97 147 162 183 219 177 Y.. 
Yk 88 «121 s«181)—Ss172,—s—«i223s 28 1,016 


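$$\frac{1}{8 \times 2}\sum_{i=1}^{2}\sum_{k=1}^{6} y_{i.k.}^2 = \frac{(-9)^2 + (-26)^2 + \cdots + (177)^2}{16} = 10{,}888.250,$$

$$\frac{1}{8 \times 6 \times 2}\sum_{i=1}^{2} y_{i...}^2 = \frac{(31)^2 + (985)^2}{96} = 10{,}116.521,$$

and

$$\frac{1}{2 \times 8 \times 2}\sum_{k=1}^{6} y_{..k.}^2 = \frac{(88)^2 + (121)^2 + \cdots + (231)^2}{32} = 5{,}869.375.$$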

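Now, the sum of squares for plants is an ordinary main effect; that is,

$$SS_A = \frac{1}{8 \times 6 \times 2}\sum_{i=1}^{2} y_{i...}^2 - \frac{y_{....}^2}{2 \times 8 \times 6 \times 2} = 10{,}116.521 - 5{,}376.333 = 4{,}740.188.$$

The sum of squares for bobbins within plants is an ordinary nested effect; that is,

$$SS_{B(A)} = \frac{1}{6 \times 2}\sum_{i=1}^{2}\sum_{j=1}^{8} y_{ij..}^2 - \frac{1}{8 \times 6 \times 2}\sum_{i=1}^{2} y_{i...}^2 = 11{,}347.167 - 10{,}116.521 = 1{,}230.646.$$

The sum of squares for distance is again an ordinary main effect; that is,

$$SS_C = \frac{1}{2 \times 8 \times 2}\sum_{k=1}^{6} y_{..k.}^2 - \frac{y_{....}^2}{2 \times 8 \times 6 \times 2} = 5{,}869.375 - 5{,}376.333 = 493.042.$$

The sum of squares for plant x distance is an ordinary two-way interaction;
that is,

$$\begin{aligned}
SS_{AC} &= \frac{1}{8 \times 2}\sum_{i=1}^{2}\sum_{k=1}^{6} y_{i.k.}^2 - \frac{1}{8 \times 6 \times 2}\sum_{i=1}^{2} y_{i...}^2 - \frac{1}{2 \times 8 \times 2}\sum_{k=1}^{6} y_{..k.}^2 + \frac{y_{....}^2}{2 \times 8 \times 6 \times 2} \\
&= 10{,}888.250 - 10{,}116.521 - 5{,}869.375 + 5{,}376.333 \\
&= 278.687.
\end{aligned}$$

The sum of squares for distance x bobbin within plants is

$$\begin{aligned}
SS_{BC(A)} &= \frac{1}{2}\sum_{i=1}^{2}\sum_{j=1}^{8}\sum_{k=1}^{6} y_{ijk.}^2 - \frac{1}{6 \times 2}\sum_{i=1}^{2}\sum_{j=1}^{8} y_{ij..}^2 - \frac{1}{8 \times 2}\sum_{i=1}^{2}\sum_{k=1}^{6} y_{i.k.}^2 + \frac{1}{8 \times 6 \times 2}\sum_{i=1}^{2} y_{i...}^2 \\
&= 13{,}697 - 11{,}347.167 - 10{,}888.250 + 10{,}116.521 \\
&= 1{,}578.104.
\end{aligned}$$

The total sum of squares is

$$SS_T = \sum_{i=1}^{2}\sum_{j=1}^{8}\sum_{k=1}^{6}\sum_{\ell=1}^{2} y_{ijk\ell}^2 - \frac{y_{....}^2}{2 \times 8 \times 6 \times 2} = 15{,}788 - 5{,}376.333 = 10{,}411.667.$$

Finally, the error sum of squares is obtained by subtraction as

$$\begin{aligned}
SS_E &= SS_T - SS_A - SS_{B(A)} - SS_C - SS_{AC} - SS_{BC(A)} \\
&= 10{,}411.667 - 4{,}740.188 - 1{,}230.646 - 493.042 - 278.687 - 1{,}578.104 \\
&= 2{,}090.999.
\end{aligned}$$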


The complete analysis of variance is shown in Table 8.6. 

The plant effect, tested against bobbins within plants, is evidently highly
significant (p < 0.001). Similarly, bobbins within plants, tested against the
error term, also seem to differ quite significantly (p < 0.001). Distances also
appear to differ significantly, that is, have considerable effect on cord strength
(p = 0.002). There is some evidence of interaction between plant and distance
(p = 0.041). However, there does not seem to be any interaction between
distance and bobbin within plants, indicating that the effect of different distances
is probably the same for all bobbins (p = 0.426). The variance components
$\sigma_e^2$, $\sigma_{\beta\gamma(\alpha)}^2$, and $\sigma_{\beta(\alpha)}^2$ are estimated as

$$\hat{\sigma}_e^2 = 21.781,$$

$$\hat{\sigma}_{\beta\gamma(\alpha)}^2 = \frac{1}{2}(22.544 - 21.781) = 0.382,$$

and

$$\hat{\sigma}_{\beta(\alpha)}^2 = \frac{1}{12}(87.903 - 21.781) = 5.510.$$


It is evident from the preceding analysis that there is a great deal of variability 
in the strength of tire cord and the larger part of this variability arises due to 
differences in the manufacturing processes of the two plants. The distances 
also differ quite significantly. The variability between bobbins within a given 
plant is quite large and the duplicate measurements on adjacent pairs also differ 
considerably. 
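The hand calculations of this section can be retraced with a short Python sketch. It starts from the marginal totals of Table 8.5 and the two sums of squared observations quoted in the text, so it is a check on the arithmetic rather than a full analysis from the raw data; all variable names are illustrative assumptions, and the F ratios use the error terms adopted above for the mixed model.

```python
a, b, c, n = 2, 8, 6, 2  # plants, bobbins per plant, distances, duplicates

# Marginal totals from Table 8.5 and sums of squares quoted in the text.
bobbin_totals = [-31, 35, 13, 39, -12, -45, 9, 23,
                 116, 146, 77, 150, 103, 159, 78, 156]   # y_ij..
plant_distance_totals = [-9, -26, 19, -11, 4, 54,
                         97, 147, 162, 183, 219, 177]    # y_i.k.
plant_totals = [31, 985]                                 # y_i...
distance_totals = [88, 121, 181, 172, 223, 231]          # y_..k.
raw = 15788      # sum of squared observations
cells = 13697    # (1/n) * sum of squared cell totals


def sq_sum(values):
    return sum(v * v for v in values)


cf = sum(plant_totals) ** 2 / (a * b * c * n)   # correction term
t_i = sq_sum(plant_totals) / (b * c * n)
t_ij = sq_sum(bobbin_totals) / (c * n)
t_k = sq_sum(distance_totals) / (a * b * n)
t_ik = sq_sum(plant_distance_totals) / (b * n)

ss = {
    "A": t_i - cf,
    "B(A)": t_ij - t_i,
    "C": t_k - cf,
    "AC": t_ik - t_i - t_k + cf,
    "BC(A)": cells - t_ij - t_ik + t_i,
    "E": raw - cells,
}
df = {"A": a - 1, "B(A)": a * (b - 1), "C": c - 1,
      "AC": (a - 1) * (c - 1), "BC(A)": a * (b - 1) * (c - 1),
      "E": a * b * c * (n - 1)}
ms = {name: ss[name] / df[name] for name in ss}

# F ratios with the error terms used in the text.
f_ratio = {
    "A": ms["A"] / ms["B(A)"],
    "B(A)": ms["B(A)"] / ms["E"],
    "C": ms["C"] / ms["BC(A)"],
    "AC": ms["AC"] / ms["BC(A)"],
    "BC(A)": ms["BC(A)"] / ms["E"],
}

# Variance component estimates for the random effects.
sigma2_e = ms["E"]
sigma2_bc = (ms["BC(A)"] - ms["E"]) / n
sigma2_b = (ms["B(A)"] - ms["E"]) / (c * n)

for name in ss:
    f_text = f"{f_ratio[name]:7.2f}" if name in f_ratio else ""
    print(f"{name:6s} {df[name]:3d} {ss[name]:10.3f} {ms[name]:10.3f} {f_text}")
```

Working from exact totals gives an error sum of squares of 2,091, which differs from the rounded hand calculation only in the last decimal place.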


8.7 USE OF STATISTICAL COMPUTING PACKAGES 


Among the SAS procedures, PROC ANOVA and PROC NESTED cannot be 
used to analyze a partially nested model since they are written for either com- 
pletely crossed or completely nested designs. PROC GLM is the procedure of 
choice for analyzing this type of model. Again, the analysis involving a random 
or mixed effects model can be handled via RANDOM and TEST options. PROC 
MIXED or VARCOMP can be used for the estimation of variance components. 
For instructions regarding SAS commands, see Section 11.1. 


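TABLE 8.6
Analysis of Variance for the Tire Cords Strength Data of Table 8.4

Source of                Degrees of    Sum of         Mean
Variation                Freedom       Squares        Square       F Value    p-Value
Plants                     1            4,740.188     4,740.188    53.93      <0.001
Bobbins                   14            1,230.646        87.903     4.04      <0.001
  (within plants)
Distances                  5              493.042        98.608     4.37       0.002
Plants x Distances         5              278.687        55.737     2.47       0.041
Distances x Bobbins       70            1,578.104        22.544     1.04       0.426
  (within plants)
Error                     96            2,090.999        21.781
Total                    191           10,411.667

Expected mean squares:

  Plants:                $\sigma_e^2 + 6 \times 2\sigma_{\beta(\alpha)}^2 + \frac{8 \times 6 \times 2}{2 - 1}\sum_{i=1}^{2}\alpha_i^2$
  Bobbins
   (within plants):      $\sigma_e^2 + 6 \times 2\sigma_{\beta(\alpha)}^2$
  Distances:             $\sigma_e^2 + 2\sigma_{\beta\gamma(\alpha)}^2 + \frac{2 \times 8 \times 2}{6 - 1}\sum_{k=1}^{6}\gamma_k^2$
  Plants x Distances:    $\sigma_e^2 + 2\sigma_{\beta\gamma(\alpha)}^2 + \frac{8 \times 2}{(2 - 1)(6 - 1)}\sum_{i=1}^{2}\sum_{k=1}^{6}(\alpha\gamma)_{ik}^2$
  Distances x Bobbins
   (within plants):      $\sigma_e^2 + 2\sigma_{\beta\gamma(\alpha)}^2$
  Error:                 $\sigma_e^2$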


DATA STRENGTHS;
INPUT PLANT BOBBIN DISTANCE $ DUPLICA STRENGTH;
DATALINES;
1 1 0YD 1 -1
1 1 0YD 2 -5
1 2 0YD 1 1
1 2 0YD 2 10
1 3 0YD 1 2
1 3 0YD 2 -3
. . . . .
2 8 2500YD 2 15
;
PROC GLM;
CLASSES PLANT BOBBIN DISTANCE;
MODEL STRENGTH = PLANT BOBBIN(PLANT) DISTANCE PLANT*DISTANCE
      BOBBIN*DISTANCE(PLANT);
RANDOM BOBBIN(PLANT) BOBBIN*DISTANCE(PLANT);
TEST H=PLANT E=BOBBIN(PLANT);
TEST H=DISTANCE E=BOBBIN*DISTANCE(PLANT);
TEST H=PLANT*DISTANCE E=BOBBIN*DISTANCE(PLANT);
RUN;

CLASS      LEVELS   VALUES
PLANT      2        1 2
BOBBIN     8        1 2 3 4 5 6 7 8
DISTANCE   6        0YD 500YD 1000YD 1500YD 2000YD 2500YD
NUMBER OF OBS. IN DATA SET=192

The SAS System
General Linear Models Procedure
Dependent Variable: STRENGTH

                          Sum of           Mean
Source             DF     Squares          Square      F Value   Pr > F
Model              95      8320.6666       87.5859     4.02      0.0001
Error              96      2091.0000       21.7812
Corrected Total   191     10411.6666667

R-Square     C.V.         Root MSE     STRENGTH Mean
0.799168     88.196006    4.6670387    5.29166667

Source                  DF    Type III SS    Mean Square   F Value   Pr > F
PLANT                    1    4740.1875      4740.1875     217.63    0.0001
BOBBIN(PLANT)           14    1230.6458        87.9033       4.04    0.0001
DISTANCE                 5     493.0417        98.6083       4.53    0.0009
PLANT*DISTANCE           5     278.6875        55.7375       2.56    0.0322
BOBBIN*DISTAN(PLANT)    70    1578.1042        22.5443       1.04    0.4340

Source                  Type III Expected Mean Square
PLANT                   Var(Error) + 2 Var(BOBBIN*DISTAN(PLANT))
                        + 12 Var(BOBBIN(PLANT)) + Q(PLANT,PLANT*DISTANCE)
BOBBIN(PLANT)           Var(Error) + 2 Var(BOBBIN*DISTAN(PLANT))
                        + 12 Var(BOBBIN(PLANT))
DISTANCE                Var(Error) + 2 Var(BOBBIN*DISTAN(PLANT))
                        + Q(DISTANCE,PLANT*DISTANCE)
PLANT*DISTANCE          Var(Error) + 2 Var(BOBBIN*DISTAN(PLANT))
                        + Q(PLANT*DISTANCE)
BOBBIN*DISTAN(PLANT)    Var(Error) + 2 Var(BOBBIN*DISTAN(PLANT))

Tests of Hypotheses using the Type III MS for BOBBIN(PLANT) as an error term
Source            DF    Type III SS    Mean Square   F Value   Pr > F
PLANT              1    4740.1875      4740.1875     53.93     0.0001

Tests of Hypotheses using the Type III MS for BOBBIN*DISTAN(PLANT) as an error term
Source            DF    Type III SS    Mean Square   F Value   Pr > F
DISTANCE           5     493.0416        98.6083      4.37     0.0016

Tests of Hypotheses using the Type III MS for BOBBIN*DISTAN(PLANT) as an error term
Source            DF    Type III SS    Mean Square   F Value   Pr > F
PLANT*DISTANCE     5     278.6875        55.7375      2.47     0.0403


(i) SAS application: SAS GLM instructions and output for the partially nested mixed 
effects analysis of variance. 


DATA LIST
 /PLANT 1
 BOBBIN 3
 DISTANCE 5
 DUPLICA 7
 STRENGTH 9-11.
BEGIN DATA.
1 1 1 1  -1
1 1 1 2  -5
1 2 1 1   1
1 2 1 2  10
1 3 1 1   2
1 3 1 2  -3
. . . . .
2 8 6 2  15
END DATA.
GLM STRENGTH BY
 PLANT
 BOBBIN
 DISTANCE
 DUPLICA
 /DESIGN PLANT
 BOBBIN(PLANT)
 DISTANCE
 PLANT*DISTANCE
 DISTANCE*BOBBIN(PLANT)
 /RANDOM BOBBIN.

Tests of Between-Subjects Effects          Dependent Variable: STRENGTH

Source                                 Type III SS    df    Mean Square
PLANT                    Hypothesis    4740.188        1    4740.188
                         Error         1230.646       14      87.903(a)
BOBBIN(PLANT)            Hypothesis    1230.646       14      87.903
                         Error         1578.104       70      22.544(b)
DISTANCE                 Hypothesis     493.042        5      98.608
                         Error         1578.104       70      22.544(b)
PLANT*DISTANCE           Hypothesis     278.688        5      55.738
                         Error         1578.104       70      22.544(b)
BOBBIN*DISTANCE(PLANT)   Hypothesis    1578.104       70      22.544
                         Error         2091.000       96      21.781(c)
a MS(BOBBIN(PLANT))   b MS(BOBBIN*DISTANCE(PLANT))   c MS(Error)

Expected Mean Squares (a,b)
                           Variance Component
Source                     Var(B(P))   Var(B*D(P))   Var(Error)   Quadratic Term
PLANT                      12.000      2.000         1.000        Plant
BOBBIN(PLANT)              12.000      2.000         1.000
DISTANCE                     .000      2.000         1.000        Distance
PLANT*DISTANCE               .000      2.000         1.000        Plant*Distance
BOBBIN*DISTANCE(PLANT)       .000      2.000         1.000
Error                        .000       .000         1.000
a For each source, the expected mean square equals the sum of the
coefficients in the cells times the variance components, plus a quadratic
term involving effects in the Quadratic Term cell. b Expected Mean Squares
are based on the Type III Sums of Squares.


(ii) SPSS application: SPSS GLM instructions and output for the partially nested mixed 
effects analysis of variance. 


FIGURE 8.2 Program Instructions and Outputs for the Partially Nested Mixed 
Effects Analysis of Variance: Measured Strengths of Tire Cords from Two Plants 
Using Different Production Processes (Table 8.4). 




FILE='C:\SAHAI\TEXTO\EJE22.TXT'.
FORMAT=FREE.
VARIABLES=2.
/VARIABLE NAMES=REP1,REP2.
/DESIGN NAMES=PLANT,DISTANCE,BOBBIN,REPLIC.
LEVELS=2,6,8,2.
RANDOM=BOBBIN,REPLIC.
FIXED=PLANT,DISTANCE.
MODEL='P,B(P),D,R(PBD)'.

BMDP8V - GENERAL MIXED MODEL ANALYSIS OF VARIANCE
- EQUAL CELL SIZES   Release: 7.0 (BMDP/DYNAMIC)

ANALYSIS OF VARIANCE DESIGN
INDEX               P     D     B     R
NUMBER OF LEVELS    2     6     8     2
POPULATION SIZE     2     6     INF   INF
MODEL   P, B(P), D, R(PBD)

ANALYSIS OF VARIANCE FOR DEPENDENT VARIABLE 1
SOURCE      ERROR     SUM OF      D.F.    MEAN        F        PROB.
            TERM      SQUARES             SQUARE
MEAN        B(P)      5376.333      1     5376.333    61.16    .0000
PLANT       B(P)      4740.188      1     4740.188    53.93    .0000
DISTANCE    DB(P)      493.042      5       98.608     4.37    .0016
B(P)        R(PDB)    1230.646     14       87.903     4.04    .0000
PD          DB(P)      278.688      5       55.738     2.47    .0403
DB(P)       R(PDB)    1578.104     70       22.544     1.04    .4343
R(PDB)                2091.000     96       21.781

SOURCE      EXPECTED MEAN        ESTIMATES OF
            SQUARE               VARIANCE COMPONENTS
MEAN        192(1)+12(4)+(7)     27.54391
PLANT       96(2)+12(4)+(7)      48.46129
DISTANCE    32(3)+2(6)+(7)        2.37700
B(P)        12(4)+(7)             5.51017
PD          16(5)+2(6)+(7)        2.07457
DB(P)       2(6)+(7)               .38155
R(PDB)      (7)                  21.78125


(iii) BMDP application: BMDP 8V instructions and output for the partially nested mixed 
effects analysis of variance. 


FIGURE 8.2 (continued) 


Among the SPSS procedures, either MANOVA or GLM could be used for 
the analysis involving random and mixed effects models. For the estimation of 
variance components, SPSS VARCOMP will be the procedure of choice. For 
instructions regarding SPSS commands, see Section 11.2. 

Among the BMDP programs, 3V or 8V can be used for partially nested 
designs. For designs with balanced structure, 8V is preferable; 2V can also be 
used but the special methods of combining crossed factor sums of squares must 
be used for obtaining sums of squares corresponding to nested factors. 


8.8 WORKED EXAMPLE USING STATISTICAL PACKAGES 


In this section, we illustrate the application of statistical packages to perform 
partially nested analysis of variance for the data set of the example presented 
in Section 8.6. Figure 8.2 illustrates the program instructions and the output 
results for analyzing data in Table 8.4 using SAS GLM, SPSS GLM, and BMDP 
8V. The typical output provides the data format listed at the top, all cell means, 
and the entries of the analysis of variance table. Note that the results are the 
same as those provided using manual computations in Section 8.6. 


EXERCISES 


1. An experiment is designed to study the performance of three different
lathes. Each lathe has three different speeds at which the product
is manufactured, and each was operated at two different feed rates.
The runs are made in random order and three observations are taken
from each speed. The relevant data in certain standard units are as
follows.

Lathe               I                    II                   III
Speed          1     2     3        1     2     3        1     2     3
Feed rates
Low          41.2  40.8  43.3     39.2  40.2  39.9     39.9  40.9  40.7
             37.4  41.9  43.9     40.6  41.8  42.8     40.1  40.5  39.9
             38.7  42.1  44.2     41.1  40.9  41.4     40.2  39.8  38.8
High         31.4  35.2  32.8     31.2  31.2  33.1     31.3  30.3  31.8
             33.4  37.4  33.2     32.1  32.2  34.2     33.2  34.5  29.1
             34.2  36.7  31.9     33.4  34.9  30.9     32.4  33.1  31.9


(a) Describe the model and the assumptions for the experiment. It 
is assumed that all three factors are fixed. 

(b) Analyze the data and report the analysis of variance table. 

(c) Test whether there are differences in lathes. Use α = 0.05.

(d) Test whether there are differences in feed rates. Use α = 0.05.

(e) Test whether there are differences in speeds. Use α = 0.05.

(f) Suppose that a large number of feed rates are available, and 
the two taken for the experiment are selected randomly. Modify 
the analysis of variance in part (b) to give the expected mean 
squares for this case and estimate the variance components of 
the model. 


2. An experiment is designed to study the microhardness of high-strength
steel purchased from three different foundries. Each foundry
supplied the steel in three different lengths of bars: 3.0, 3.50, or 4.0
inches. Inasmuch as the production of different lengths of bar from
a common ingot required different extrusion techniques, this factor
may be important. Moreover, the bars were forged from ingots
produced at different temperatures. Each foundry provided two test
specimens of each bar from three different temperatures. The resulting
data in certain standard units are as follows.


Foundry A i I 


Temperature °C °C ne 
1100 1200 1300 1100 1200 1300 1100 1200 1300 


Bar Length (in.) 


3.0 1.841 1.957 1.846 1.912 1.957 1.926 1.858 1.886 1.935 
1.869 1.911 1.817 1.874 1.993 1.931 1.897 1.879 1.926 
3.5 1.927 1.919 1.861 1.885 1.995 1.957 1.884 1.871 1.993 
1.911 1.973 1.849 1.879 1.986 1.968 1.875 1.876 1.975 
4.0 1.898 1.957 1.884 1.858 1.973 1.947 1.912 1.891 1.929 


1.893 1.993 1.826 1.826 1.939 1.953 1.873 1.882 1.934 


Partially Nested Classifications 453 


(a) Describe the model and the assumptions for the experiment. It 
is assumed that all three factors are fixed. 

(b) Analyze the data and report the analysis of variance table. 

(c) Test whether there are differences in the microhardness of steel
purchased from the three foundries. Use α = 0.05.

(d) Test whether there are differences in the microhardness of steel
of different lengths of bar. Use α = 0.05.

(e) Suppose that bars may be acquired in many lengths and the three 
lengths being used in the experiment were selected randomly. 
Make the necessary modifications in the analysis of variance in 
part (b) to reflect the expected mean squares for this situation 
and estimate the variance components of the model. 

3. Brownlee (1965, p. 544) reported data from an experiment involving
five laboratories that participated in measuring the brightness of six
lamps of each of two types. The brightness of each lamp type was
measured in all five laboratories. The data on values of candle power
measured at different laboratories are as follows.


                      Laboratory*
Type   Lamps     A      B      C      D      E
I      1        741    768    770    772    738
       2        731    763    755    742    724
       3        731    763    757    760    728
       4        759    779    775    774    752
       5        738    758    750    750    730
       6        770    795    800    800    768
II     1        625    650    655    651    615
       2        590    611    605    625    588
       3        602    630    640    630    605
       4        578    607    640    608    581
       5        578    604    605    608    573
       6        625    673    670    664    631


* All figures have been multiplied by 100 and then 1,000 subtracted from them. 
Source: Brownlee (1965, p. 544). Used with permission. 


(a) Describe the model and the assumptions for the experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Test whether there are differences in laboratories. Use α = 0.05.

(d) Test whether there are differences in lamp types. Use α = 0.05.

(e) Test whether there are differences in lamps within types. Use
α = 0.05.

(f) Determine 95 percent confidence limits for the difference between
lamp types. (It is assumed that the laboratory x lamp type
interaction is negligible.)

(g) Determine 95 percent confidence limits for the difference between
laboratory A and laboratory E. (Again it may be assumed
that laboratory x lamp type interaction is negligible.)

(h) Determine 95 percent confidence limits for the difference between
lamp type I in laboratory A and lamp type II in laboratory
B.

(i) Estimate the component of variance for lamps within types.

4. Brownlee (1965, p. 545) reported data from an experiment involving 
five laboratories that took part in a test comparison of their mea- 
surement procedures for evaluating the impact strength of a type of 
fiberboard. Panels from two batches of board were tested by each 
of the five laboratories for each batch in duplicate on three days. 
The three days reported in the experiment were different for each 
laboratory. The data on impact strengths are as follows. 


                       Laboratory
Day   Batch     A       B       C       D       E
1     1       1483    1449    1499    1428    1509
              1496    1400    1472    1401    1439
      2       1504    1465    1506    1407    1480
              1505    1423    1537    1416    1429
2     1       1441    1477    1483    1404    1416
              1416    1471    1509    1419    1441
      2       1477    1418    1578    1455    1364
              1457    1445    1486    1435    1441
3     1       1450    1446    1489    1414    1419
              1478    1398    1435    1446    1444
      2       1435    1424    1499    1423    1437
              1478    1426    1491    1442    1438


Source: Brownlee (1965, p. 545). Used with permission. 


(a) Describe the model and the assumptions for the experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Test whether there are differences in laboratories. Use
$\alpha = 0.05$.

(d) Test whether there are differences in days. Use $\alpha = 0.05$.

(e) Test whether there are differences in batches. Use $\alpha = 0.05$.

(f) Determine 95 percent confidence limits for the mean difference 
between laboratories A and E. 

(g) Estimate the within day component of variance averaged over
laboratories.

(h) Estimate the between day component of variance averaged over
laboratories.


5. Desmond (1954) reported the results of an experiment in testing the 
operation of voltage regulators involving four setting stations. From 
each of these stations, three regulators were randomly selected and 
each was tested at four test stations. The data are given as follows. 


Setting    Regulator          Test Station
Station       No.          1        2        3
   1           1         16.5     16.1     16.2
               2         15.9     15.4     15.8
               3         16.9     15.9     16.0
   2           1         16.7     16.1     15.7
               2         17.0     16.4     16.4
               3         16.3     16.1     16.1
   3           1         17.0     16.1     15.8
               2         16.6     16.3     15.9
               3         16.3     15.9     16.2
   4           1         16.8     16.7     16.3
               2         16.1     16.0     16.0
               3         16.2     16.1     16.1


Source: Desmond (1954). Used with permission. 


(a) Describe the model and the assumptions for the experiment.
Assume a fixed effect model for setting stations and test stations.

(b) Analyze the data and report the analysis of variance table.

(c) Test whether there are differences in test stations. Use
$\alpha = 0.05$.

(d) Test whether there are differences in setting stations. Use
$\alpha = 0.05$.

(e) Test whether there are differences in regulators within setting
stations. Use $\alpha = 0.05$.

(f) Present a table of the means of each setting station with estimated
standard error for each mean.

(g) Determine 95 percent confidence limits for the mean differences
between test stations 1 and 4.


6. Consider the experiment described in the worked example in Section
8.5. The analysis of variance for the phosphorus data is performed
in exactly the same manner as for the calcium data and the results
are as follows.

Analysis of Variance for the Phosphorus Content of Turnip Leaves Data

Source of            Degrees of   Mean        Expected
Variation            Freedom      Square      Mean Square
Plants                   3        0.056375    $\sigma_e^2 + 2\sigma_{\beta\gamma(\alpha)}^2 + 4 \times 2\sigma_{\alpha\gamma}^2 + 2 \times 2\sigma_{\beta(\alpha)}^2 + 4 \times 2 \times 2\sigma_{\alpha}^2$
Leaves                  12        0.035786    $\sigma_e^2 + 2\sigma_{\beta\gamma(\alpha)}^2 + 2 \times 2\sigma_{\beta(\alpha)}^2$
(within plants)
Ashings                  1        0.000467    $\sigma_e^2 + 2\sigma_{\beta\gamma(\alpha)}^2 + 4 \times 2\sigma_{\alpha\gamma}^2 + 4 \times 4 \times 2\sigma_{\gamma}^2$
Plants × Ashings         3        0.000664    $\sigma_e^2 + 2\sigma_{\beta\gamma(\alpha)}^2 + 4 \times 2\sigma_{\alpha\gamma}^2$
Ashings × Leaves        12        0.000935    $\sigma_e^2 + 2\sigma_{\beta\gamma(\alpha)}^2$
(within plants)
Error                   32        0.000457    $\sigma_e^2$
Total                   63


Source: Schultz (1954). Used with permission. 


(a) Test whether there are differences in effects due to plants. Use
$\alpha = 0.05$.

(b) Test whether there are differences in effects due to leaves within
plants. Use $\alpha = 0.05$.

(c) Test whether there are differences in effects due to ashings. Use
$\alpha = 0.05$.

(d) Test whether there are differences in effects due to plants ×
ashings. Use $\alpha = 0.05$.

(e) Test whether there are differences in effects due to ashings ×
leaves within plants. Use $\alpha = 0.05$.

(f) Estimate the variance components of the model and determine 
95 percent confidence intervals on them. 


7. Anderson (1954) reported the results of an experiment designed to
compare the absorption properties of ceramic compositions. There
are 15 ceramic compositions and the experiment was performed
under three different temperatures. Two batches of each composition
were prepared and two firings were made at each temperature.
Finally, one observation was made for each firing of a batch, giving
a total of 180 observations. The mathematical model for this
experiment would be:

\[
y_{ijk\ell} = \mu + \alpha_i + \beta_{j(i)} + \gamma_k + \delta_{\ell(k)} + (\alpha\gamma)_{ik}
 + (\alpha\delta)_{i\ell(k)} + (\beta\gamma)_{jk(i)} + e_{\ell(ijk)}
\qquad
\begin{cases}
i = 1, 2, 3 \\
j = 1, 2 \\
k = 1, 2, \ldots, 15 \\
\ell = 1, 2,
\end{cases}
\]

where $\mu$ is the general mean, $\alpha_i$ is the effect of the $i$-th
temperature, $\beta_{j(i)}$ is the effect of the $j$-th firing within the
$i$-th temperature, $\gamma_k$ is the effect of the $k$-th composition,
$\delta_{\ell(k)}$ is the effect of the $\ell$-th batch within the $k$-th
composition, $(\alpha\gamma)_{ik}$ is the interaction of the $i$-th
temperature with the $k$-th composition, $(\alpha\delta)_{i\ell(k)}$ is the
interaction of the $i$-th temperature with the $\ell$-th batch within the
$k$-th composition, $(\beta\gamma)_{jk(i)}$ is the interaction of the $j$-th
firing with the $k$-th composition within the $i$-th temperature, and
$e_{\ell(ijk)}$ is the customary error term. Note that it is assumed that
there are no $(\beta\delta)_{j\ell(ik)}$ interactions. Under the assumption
that all effects are random, the $\alpha_i$'s, $\beta_{j(i)}$'s,
$\gamma_k$'s, $\delta_{\ell(k)}$'s, $(\alpha\gamma)_{ik}$'s,
$(\alpha\delta)_{i\ell(k)}$'s, $(\beta\gamma)_{jk(i)}$'s, and
$e_{\ell(ijk)}$'s are normally distributed with means zero and variances
$\sigma_\alpha^2$, $\sigma_{\beta(\alpha)}^2$, $\sigma_\gamma^2$,
$\sigma_{\delta(\gamma)}^2$, $\sigma_{\alpha\gamma}^2$,
$\sigma_{\alpha\delta(\gamma)}^2$, $\sigma_{\beta\gamma(\alpha)}^2$, and
$\sigma_e^2$, respectively. The analysis of variance table is given as
follows.


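Analysis of Variance for the Ceramic Compositions Data

Source of                  Degrees of   Mean          Expected
Variation                  Freedom      Square        Mean Square
Temperatures                   2        1,179.9900    $\sigma_e^2 + 2\sigma_{\beta\gamma(\alpha)}^2 + 2\sigma_{\alpha\delta(\gamma)}^2 + 4\sigma_{\alpha\gamma}^2 + 30\sigma_{\beta(\alpha)}^2 + 60\sigma_{\alpha}^2$
Firings                        3            0.1521    $\sigma_e^2 + 2\sigma_{\beta\gamma(\alpha)}^2 + 30\sigma_{\beta(\alpha)}^2$
(within temperatures)
Compositions                  14           10.3400    $\sigma_e^2 + 2\sigma_{\beta\gamma(\alpha)}^2 + 2\sigma_{\alpha\delta(\gamma)}^2 + 4\sigma_{\alpha\gamma}^2 + 6\sigma_{\delta(\gamma)}^2 + 12\sigma_{\gamma}^2$
Batches                       15            0.7405    $\sigma_e^2 + 2\sigma_{\alpha\delta(\gamma)}^2 + 6\sigma_{\delta(\gamma)}^2$
(within compositions)
Temperatures ×                28            1.1130    $\sigma_e^2 + 2\sigma_{\beta\gamma(\alpha)}^2 + 2\sigma_{\alpha\delta(\gamma)}^2 + 4\sigma_{\alpha\gamma}^2$
Compositions
Temperatures × Batches        30            0.0857    $\sigma_e^2 + 2\sigma_{\alpha\delta(\gamma)}^2$
(within compositions)
Firings × Compositions        42            0.0818    $\sigma_e^2 + 2\sigma_{\beta\gamma(\alpha)}^2$
(within temperatures)
Error                         45            0.0631    $\sigma_e^2$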


Source: Anderson (1954). Used with permission. 


(a) Test whether there are differences in absorption properties among
different temperatures. Use $\alpha = 0.05$.

(b) Test whether there are differences in absorption properties among
firings within temperatures. Use $\alpha = 0.05$.

(c) Test whether there are differences in absorption properties among
compositions. Use $\alpha = 0.05$.

(d) Test whether there are differences in absorption properties among
batches within compositions.

(e) Test for the following interaction effects: temperatures ×
compositions, temperatures × batches (within compositions), and
firings × compositions (within temperatures).

(f) Determine the point and interval estimates for each of the
variance components of the model.

8. Consider a variation of the four-factor partially nested classification
described in Section 8.4, where now $b$ levels of factor $B$ are nested
under each level of $A$ and $d$ levels of factor $D$ are nested under each
level of $C$; that is, model (8.4.1) is now given by

\[
y_{ijk\ell m} = \mu + \alpha_i + \beta_{j(i)} + \gamma_k + \delta_{\ell(k)} + (\alpha\gamma)_{ik}
 + (\alpha\delta)_{i\ell(k)} + (\beta\gamma)_{jk(i)} + e_{m(ijk\ell)}
\qquad
\begin{cases}
i = 1, 2, \ldots, a \\
j = 1, 2, \ldots, b \\
k = 1, 2, \ldots, c \\
\ell = 1, 2, \ldots, d \\
m = 1, 2, \ldots, n,
\end{cases}
\]

where the meaning of each symbol and the assumptions of the model
are readily stated. Note that it is assumed that certain second- and
higher-order interactions are zero.

(a) Describe the assumptions of the model under Models I, II, and
III. Under Model III assume that factors A and C are fixed and
B and D are random.

(b) Develop the analysis of variance including expected mean
squares under the assumptions of Models I, II, and III.

(c) Describe appropriate F tests for the effects of factors A, B, C,
and D and their interactions for all three models, assuming
normality for the random effects.

9. Consider a three-factor study where factors A and B are crossed and
factor C is nested within factors A and B. Further, suppose that A,
B, and C are all random having $a$, $b$, and $c$ levels respectively and
there are $n$ replications in each cell. The mathematical model for this
type of layout would be

\[
y_{ijk\ell} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \gamma_{k(ij)} + e_{\ell(ijk)}
\qquad
\begin{cases}
i = 1, 2, \ldots, a \\
j = 1, 2, \ldots, b \\
k = 1, 2, \ldots, c \\
\ell = 1, 2, \ldots, n,
\end{cases}
\]

where $\alpha_i$ is the effect of the $i$-th level of factor A, $\beta_j$ is
the effect of the $j$-th level of factor B, $(\alpha\beta)_{ij}$ is the
interaction between the $i$-th level of factor A and the $j$-th level of
factor B, $\gamma_{k(ij)}$ is the effect of the $k$-th level of factor C
within the combination of the $i$-th level of factor A and the $j$-th level
of factor B, and $e_{\ell(ijk)}$ is the customary error term. It is
assumed that the random effects $\alpha_i$'s, $\beta_j$'s,
$(\alpha\beta)_{ij}$'s, $\gamma_{k(ij)}$'s, and $e_{\ell(ijk)}$'s are all
mutually and completely uncorrelated random variables with zero means and
variances $\sigma_\alpha^2$, $\sigma_\beta^2$, $\sigma_{\alpha\beta}^2$,
$\sigma_{\gamma(\alpha\beta)}^2$, and $\sigma_e^2$, respectively. The
overall sum of squares is partitioned as


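\[
SS_T = SS_A + SS_B + SS_{AB} + SS_{C(AB)} + SS_E,
\]

where

\[
SS_T = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\sum_{\ell=1}^{n} (y_{ijk\ell} - \bar{y}_{....})^2,
\]
\[
SS_A = bcn\sum_{i=1}^{a} (\bar{y}_{i...} - \bar{y}_{....})^2,
\qquad
SS_B = acn\sum_{j=1}^{b} (\bar{y}_{.j..} - \bar{y}_{....})^2,
\]
\[
SS_{AB} = cn\sum_{i=1}^{a}\sum_{j=1}^{b} (\bar{y}_{ij..} - \bar{y}_{i...} - \bar{y}_{.j..} + \bar{y}_{....})^2,
\]
\[
SS_{C(AB)} = n\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c} (\bar{y}_{ijk.} - \bar{y}_{ij..})^2,
\]

and

\[
SS_E = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\sum_{\ell=1}^{n} (y_{ijk\ell} - \bar{y}_{ijk.})^2.
\]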


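Finally, the analysis of variance table for this model is shown as
follows.

Source of           Degrees of        Sum of        Mean          Expected
Variation           Freedom           Squares       Square        Mean Square
Factor A            $a - 1$           $SS_A$        $MS_A$        $\sigma_e^2 + n\sigma_{\gamma(\alpha\beta)}^2 + cn\sigma_{\alpha\beta}^2 + bcn\sigma_{\alpha}^2$
Factor B            $b - 1$           $SS_B$        $MS_B$        $\sigma_e^2 + n\sigma_{\gamma(\alpha\beta)}^2 + cn\sigma_{\alpha\beta}^2 + acn\sigma_{\beta}^2$
Interaction A × B   $(a - 1)(b - 1)$  $SS_{AB}$     $MS_{AB}$     $\sigma_e^2 + n\sigma_{\gamma(\alpha\beta)}^2 + cn\sigma_{\alpha\beta}^2$
Factor C            $ab(c - 1)$       $SS_{C(AB)}$  $MS_{C(AB)}$  $\sigma_e^2 + n\sigma_{\gamma(\alpha\beta)}^2$
(within A and B)
Error               $abc(n - 1)$      $SS_E$        $MS_E$        $\sigma_e^2$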


(a) Develop the results on expected mean squares using the rules 
given in Appendix U. 

(b) Assuming normality, determine the tests of hypotheses for 
testing the effects corresponding to factors A, B, C, and the 
interaction A x B. 


(c) Determine the estimators of the variance components based on
the analysis of variance procedure.

(d) Repeat parts (a) through (c) for a mixed model analysis where
A is fixed and B and C are random.

(e) Repeat parts (a) through (c) for a mixed model analysis where
A and B are fixed and C is random.

10. Consider a four factor study where B is nested within A, D is nested
within A, B, and C; and A and C are crossed. Further, suppose that A,
B, C, and D are all random having $a$, $b$, $c$, and $d$ levels respectively
and there are $n$ replications in each cell. The mathematical model for
this type of layout would be

\[
y_{ijk\ell m} = \mu + \alpha_i + \beta_{j(i)} + \gamma_k + (\alpha\gamma)_{ik}
 + (\beta\gamma)_{jk(i)} + \delta_{\ell(ijk)} + e_{m(ijk\ell)}
\qquad
\begin{cases}
i = 1, 2, \ldots, a \\
j = 1, 2, \ldots, b \\
k = 1, 2, \ldots, c \\
\ell = 1, 2, \ldots, d \\
m = 1, 2, \ldots, n,
\end{cases}
\]

where $\alpha_i$ is the effect of the $i$-th level of factor A,
$\beta_{j(i)}$ is the effect of the $j$-th level of factor B within the
$i$-th level of factor A, $\gamma_k$ is the effect of the $k$-th level of
factor C, $(\alpha\gamma)_{ik}$ is the interaction between the $i$-th level
of factor A and the $k$-th level of factor C, $(\beta\gamma)_{jk(i)}$ is
the interaction between the $j$-th level of factor B and the $k$-th level
of factor C within the $i$-th level of factor A, $\delta_{\ell(ijk)}$ is the
effect of the $\ell$-th level of factor D within the combination of the
$i$-th level of factor A, the $j$-th level of factor B, and the $k$-th level
of factor C, and $e_{m(ijk\ell)}$ is the customary error term. It is assumed
that the random effects $\alpha_i$'s, $\beta_{j(i)}$'s, $\gamma_k$'s,
$(\alpha\gamma)_{ik}$'s, $(\beta\gamma)_{jk(i)}$'s, $\delta_{\ell(ijk)}$'s,
and $e_{m(ijk\ell)}$'s are all mutually and completely uncorrelated random
variables with zero means and variances $\sigma_\alpha^2$,
$\sigma_{\beta(\alpha)}^2$, $\sigma_\gamma^2$, $\sigma_{\alpha\gamma}^2$,
$\sigma_{\beta\gamma(\alpha)}^2$, $\sigma_{\delta(\alpha\beta\gamma)}^2$, and
$\sigma_e^2$, respectively.

(a) Develop a partitioning of the total sum of squares corresponding 
to four main effects (including two nested), the two interactions, 
and a residual term. 

(b) Report the analysis of variance table including expected mean 
squares. 

(c) Assuming normality, determine the tests of hypotheses for testing
the effects corresponding to factors A, B, C, and D, and the
interactions A × C and B × C (within A).

(d) Determine the estimators of variance components based on the 
analysis of variance procedure. 

(e) Repeat parts (b) through (d) for a mixed model analysis where 
factors A and C are fixed and B and D are random. 


Finite Population and 
Other Models 


9.0 PREVIEW 


As discussed earlier, so far in this volume we have been primarily concerned 
with random effects models or Model II based on the infinite population theory, 
that is, when the treatments included in the experiment are assumed to be a 
random sample from a population of treatments having infinite size or when 
the experimenter selects the levels at random from a large number of possi- 
ble levels of a factor usually considered as infinite. However, as described in 
Section 1.4, there are situations when the treatments selected may be a sample 
from a finite population and then the assumptions of an infinite population may 
be inappropriate. For example, in a large laboratory, there could be a total of 
10 analysts and the data obtained on just three of them could be used to make 
inferences concerning a new method for the determination of arginine content 
as used by the entire group of 10 analysts. 

The finite population model is also of interest because if we let the population
sizes go to infinity, then we obtain Model II, and if we decrease the population
size until it equals the sample size (so that the sample comprises the entire 
population), then we obtain Model I. If some population sizes are increased to 
infinity while others decreased to sample sizes, we are in a Model III situation. 
Under finite population models, the calculations for sums of squares, degrees 
of freedom, and mean squares remain the same. The difference lies in the 
derivation of expected mean squares and consequently in the estimation of the 
parameters and the testing of hypotheses. In this chapter, we briefly present 
the results for the finite population models. These models were first considered 
by Tukey (1949 a, c, 1950), Cornfield and Tukey (1956), and Bennett and 
Franklin (1954, Chapter VII). The interested reader is advised to go over these 
references for a more thorough treatment of the topic. 


9.1 ONE-WAY FINITE POPULATION MODEL 


For a one-way classification, with $a$ groups or $a$ levels of factor $A$ and $n$
observations per group, the mathematical model under finite population theory is
the same as (2.1.1), namely,

\[
y_{ij} = \mu + \alpha_i + e_{ij},
\qquad
\begin{cases}
i = 1, 2, \ldots, a \\
j = 1, 2, \ldots, n,
\end{cases}
\tag{9.1.1}
\]

where $y_{ij}$ is the value of the $j$-th observation in the $i$-th group. However, now
the assumptions are as follows:

(i) As before, $\mu$ is the constant general effect.

(ii) There is a population of effects due to factor $A$ of size $A$ with mean zero
and variance $\sigma_\alpha^2$. The $\alpha_i$'s are assumed to be a random sample of
size $a$ from this population. We denote a particular level of $A$ in the population
by $\alpha_I$, where $I = 1, 2, \ldots, A$. The $\alpha_I$'s in the population satisfy the
condition $\sum_{I=1}^{A} \alpha_I = 0$.

(iii) Sampling is random in each group and independent among different
groups. The $e_{ij}$'s are a random sample of size $n$ from an infinite
population with mean zero and variance $\sigma_e^2$.

(iv) We make the following definitions of population variances, that is, the
variance components of the model (9.1.1),

\[
\sigma_\alpha^2 = \frac{1}{A - 1}\sum_{I=1}^{A} \alpha_I^2
\]

and

\[
\sigma_e^2 = E(e_{ij}^2).
\]

For the finite population model (9.1.1), the entire analysis of variance, including
the sums of squares, mean squares, and expected mean squares, remains
the same and is summarized in Table 9.1. Thus, in Table 9.1, if $A = a$, the
definition of $\sigma_\alpha^2$ corresponds to the Model I case of Table 2.1; and if
$A = \infty$, it corresponds to the Model II case of Table 2.1.

TABLE 9.1
Analysis of Variance for Model (9.1.1)

Source of    Degrees of    Sum of    Mean      Expected
Variation    Freedom       Squares   Square    Mean Square
Between      $a - 1$       $SS_B$    $MS_B$    $\sigma_e^2 + n\sigma_\alpha^2$
Within       $a(n - 1)$    $SS_W$    $MS_W$    $\sigma_e^2$
Total        $an - 1$      $SS_T$
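The role of the population size $A$ can be illustrated numerically. The following is a minimal sketch, not part of the original text: the five effects in `population` and both function names are hypothetical. It computes $\sigma_\alpha^2$ with the divisor $A - 1$ and checks, by enumerating every sample drawn without replacement, the standard sampling result that the mean of $a$ sampled effects has variance $(1 - a/A)\sigma_\alpha^2/a$. The factor $1 - a/A$ vanishes when $a = A$ (the Model I end) and tends to 1 as $A \to \infty$ (the Model II end).

```python
from itertools import combinations


def finite_population_variance(effects):
    """Return sum(alpha_I^2) / (A - 1) after centring the effects at zero."""
    size = len(effects)
    mean = sum(effects) / size
    return sum((x - mean) ** 2 for x in effects) / (size - 1)


def variance_of_sample_mean(effects, a):
    """Exact variance of the mean of `a` effects drawn without replacement,
    found by enumerating every possible sample."""
    size = len(effects)
    mean = sum(effects) / size
    means = [sum(s) / a for s in combinations(effects, a)]
    return sum((m - mean) ** 2 for m in means) / len(means)


# Hypothetical population of A = 5 effects that sum to zero.
population = [-3.0, -1.0, 0.0, 1.0, 3.0]
A = len(population)
sigma2_alpha = finite_population_variance(population)

for a in (1, 2, A):
    exact = variance_of_sample_mean(population, a)
    formula = (1 - a / A) * sigma2_alpha / a
    print(a, exact, formula)
```

With $a = A$ the sample mean is the (zero) population mean in every sample, so both columns are zero.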


9.2 TWO-WAY CROSSED FINITE POPULATION MODEL 


For a two-way crossed classification, with factor $A$ having $a$ levels, factor $B$
having $b$ levels, and $n$ replications per cell, the mathematical model under finite
population theory is the same as (4.1.1); that is,

\[
y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + e_{ijk},
\qquad
\begin{cases}
i = 1, 2, \ldots, a \\
j = 1, 2, \ldots, b \\
k = 1, 2, \ldots, n,
\end{cases}
\tag{9.2.1}
\]

where $y_{ijk}$ is the score of the $k$-th observation at the $i$-th level of factor $A$ and
the $j$-th level of factor $B$. However, underlying model (9.2.1), we now have the
following assumptions:


(i) As before, $\mu$ is the constant general effect.

(ii) There is a population of main effects due to factor $A$ of size $A$ with mean
zero and variance $\sigma_\alpha^2$. The $\alpha_i$'s are assumed to be a random sample of
size $a$ from this population. We denote a particular level of $A$ in the
population by $\alpha_I$, where $I = 1, 2, \ldots, A$. The $\alpha_I$'s satisfy the condition
$\sum_{I=1}^{A} \alpha_I = 0$.

(iii) There is a population of main effects due to factor $B$ of size $B$ with mean
zero and variance $\sigma_\beta^2$. The $\beta_j$'s are assumed to be a random sample of
size $b$ from this population. We denote a particular level of $B$ in the
population by $\beta_J$, where $J = 1, 2, \ldots, B$. The $\beta_J$'s satisfy the condition
$\sum_{J=1}^{B} \beta_J = 0$.

(iv) For each combination of a potential level of $A$ with a potential level of $B$,
there is a population of interaction effects of size $A \times B$ with mean zero
and variance $\sigma_{\alpha\beta}^2$. Selecting a particular $I$ and a particular $J$ determines
the row and column and hence the cell that forms their interaction, and
with this cell is associated the interaction $(\alpha\beta)_{IJ}$. The interaction terms
satisfy the conditions:

\[
\sum_{I=1}^{A} (\alpha\beta)_{IJ} = 0, \quad \text{for each } J
\]

and

\[
\sum_{J=1}^{B} (\alpha\beta)_{IJ} = 0, \quad \text{for each } I.
\]

(v) Sampling is at random in each cell and independent between different
cells. The $e_{ijk}$'s are a random sample of size $n$ from an infinite population
with mean zero and variance $\sigma_e^2$.

(vi) We make the following definitions of the population variances, that is,
the variance components of model (9.2.1):

\[
\sigma_\alpha^2 = \frac{1}{A - 1}\sum_{I=1}^{A} \alpha_I^2,
\]
\[
\sigma_\beta^2 = \frac{1}{B - 1}\sum_{J=1}^{B} \beta_J^2,
\]
\[
\sigma_{\alpha\beta}^2 = \frac{1}{(A - 1)(B - 1)}\sum_{I=1}^{A}\sum_{J=1}^{B} (\alpha\beta)_{IJ}^2,
\]

and

\[
\sigma_e^2 = E(e_{ijk}^2).
\]

TABLE 9.2
Analysis of Variance for Model (9.2.1)

Source of      Degrees of         Sum of      Mean        Expected
Variation      Freedom            Squares     Square      Mean Square
Factor A       $a - 1$            $SS_A$      $MS_A$      $\sigma_e^2 + n\left(1 - \frac{b}{B}\right)\sigma_{\alpha\beta}^2 + bn\sigma_\alpha^2$
Factor B       $b - 1$            $SS_B$      $MS_B$      $\sigma_e^2 + n\left(1 - \frac{a}{A}\right)\sigma_{\alpha\beta}^2 + an\sigma_\beta^2$
Interaction    $(a - 1)(b - 1)$   $SS_{AB}$   $MS_{AB}$   $\sigma_e^2 + n\sigma_{\alpha\beta}^2$
A × B
Error          $ab(n - 1)$        $SS_E$      $MS_E$      $\sigma_e^2$
Total          $abn - 1$          $SS_T$


For the finite population model (9.2.1), the sums of squares and mean squares 
are the same as those shown in Table 4.2. However, the expected mean squares 
are those shown in Table 9.2. The derivation of expected mean squares involves 
some tedious algebra and can be found in Cornfield and Tukey (1956), Bennett 
and Franklin (1954, pp. 368-373), and Brownlee (1965, pp. 489-498). We 
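simply mention here the results on the covariance structure of the $\alpha_i$'s, $\beta_j$'s,
and $(\alpha\beta)_{ij}$'s which are employed in finding expected mean squares. Thus,

\[
\operatorname{Cov}(\alpha_i, \alpha_{i'}) = -\frac{\sigma_\alpha^2}{A}, \qquad i \neq i',
\]
\[
\operatorname{Cov}(\beta_j, \beta_{j'}) = -\frac{\sigma_\beta^2}{B}, \qquad j \neq j',
\]
\[
\operatorname{Cov}\{(\alpha\beta)_{ij}, (\alpha\beta)_{i'j'}\} =
\begin{cases}
\dfrac{\sigma_{\alpha\beta}^2}{AB}, & i \neq i',\ j \neq j' \\[2ex]
-\dfrac{(1 - 1/B)}{A}\,\sigma_{\alpha\beta}^2, & i \neq i',\ j = j' \\[2ex]
-\dfrac{(1 - 1/A)}{B}\,\sigma_{\alpha\beta}^2, & i = i',\ j \neq j'.
\end{cases}
\]

Furthermore, because the $\alpha_i$'s and $\beta_j$'s are selected independently, their
covariances are zero. The values of the $(\alpha\beta)_{ij}$'s, on the other hand, in general
depend on the $i$-th level of $A$ and the $j$-th level of $B$ and thus the $(\alpha\beta)_{ij}$'s are
not independent of the $\alpha_i$'s and $\beta_j$'s in the sample. However, it can be shown
that the covariance between $\alpha_i$ and $(\alpha\beta)_{i'j}$ is zero irrespective of whether
$i = i'$; similarly, the covariance between $\beta_j$ and $(\alpha\beta)_{ij}$ is zero. Thus, we
obtain the following results:

\[
\operatorname{Cov}(\alpha_i, \beta_j) = 0,
\]
\[
\operatorname{Cov}(\alpha_i, (\alpha\beta)_{ij}) = 0,
\]
\[
\operatorname{Cov}(\alpha_i, (\alpha\beta)_{i'j}) = 0,
\]
\[
\operatorname{Cov}(\beta_j, (\alpha\beta)_{ij}) = 0,
\]

and

\[
\operatorname{Cov}(\beta_j, (\alpha\beta)_{ij'}) = 0.
\]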
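The covariance results for the interaction terms can be checked by brute force on a small population. The sketch below is illustrative only: the $3 \times 4$ table `raw` is made-up data, and `double_centre` and `interaction_covariances` are hypothetical helper names. It centres the table so that every row and column sums to zero, then averages the products $(\alpha\beta)_{IJ}(\alpha\beta)_{I'J'}$ over all ordered pairs of distinct rows and/or columns. Because the population mean of the interaction effects is zero, these averages are the covariances.

```python
from itertools import permutations


def double_centre(table):
    """Subtract row and column means so every row and column sums to zero."""
    A, B = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (A * B)
    row_mean = [sum(row) / B for row in table]
    col_mean = [sum(table[i][j] for i in range(A)) / A for j in range(B)]
    return [[table[i][j] - row_mean[i] - col_mean[j] + grand
             for j in range(B)] for i in range(A)]


def interaction_covariances(ab):
    """Average products over all ordered pairs of distinct rows and columns."""
    A, B = len(ab), len(ab[0])
    sigma2 = sum(x * x for row in ab for x in row) / ((A - 1) * (B - 1))
    row_pairs = list(permutations(range(A), 2))
    col_pairs = list(permutations(range(B), 2))
    both_differ = sum(ab[i][j] * ab[i2][j2]
                      for i, i2 in row_pairs for j, j2 in col_pairs)
    both_differ /= len(row_pairs) * len(col_pairs)
    same_column = sum(ab[i][j] * ab[i2][j]
                      for i, i2 in row_pairs for j in range(B))
    same_column /= len(row_pairs) * B
    same_row = sum(ab[i][j] * ab[i][j2]
                   for i in range(A) for j, j2 in col_pairs)
    same_row /= A * len(col_pairs)
    return sigma2, both_differ, same_column, same_row


# Hypothetical 3 x 4 population table (A = 3, B = 4).
raw = [[1.0, 2.0, 4.0, 0.0], [0.0, 5.0, 1.0, 2.0], [3.0, 3.0, 8.0, 1.0]]
ab = double_centre(raw)
A, B = 3, 4
sigma2, both_differ, same_column, same_row = interaction_covariances(ab)
print(both_differ, sigma2 / (A * B))
print(same_column, -(1 - 1 / B) * sigma2 / A)
print(same_row, -(1 - 1 / A) * sigma2 / B)
```

Each printed pair should agree up to floating-point rounding.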


TESTS OF HYPOTHESES 


The results on expected mean squares provide a valuable guide to deciding 
which mean squares in the analysis of variance table are to be compared. For 
example, from Table 9.2, it is seen that a hypothesis test is only available to 
test for the existence of interaction in the general case. To develop tests for the 
main effects, we must limit ourselves to certain special cases described in the 
following: 


(i) In Table 9.2, if the samples of levels of factors $A$ and $B$ correspond to
the entire population, so that $A = a$, $B = b$, then the expected values
of the mean squares are those given in Table 4.2 for Model I and the
finite population model becomes exactly Model I. Hence, the tests for
the main effects are obtained by dividing the mean squares by the error
mean square.

(ii) If the populations of levels of factors $A$ and $B$ are infinitely large so that
$1 - a/A$ and $1 - b/B$ both tend to 1, then all covariances approach zero
and the expected values of the mean squares are given exactly as in
Table 4.2 for Model II and the finite population model becomes exactly
Model II. Hence, the main effects are tested against the interaction mean
square.

(iii) If the $\alpha_i$'s are samples from an infinite population and the $\beta_j$'s are
samples from the entire population (i.e., $A = \infty$ and $B = b$), then the
expected values of the mean squares are given exactly as in Table 4.2 for
Model III and the finite population model becomes exactly Model III.
Thus, the factor $A$ effect is tested against the error and the factor $B$
effect is tested against the interaction mean square.
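The three special cases can be read off mechanically from the coefficient of $\sigma_{\alpha\beta}^2$ in Table 9.2. The following sketch is an illustration under that reading, not a procedure from the text; `two_way_ems` and `exact_denominator` are hypothetical names, and an infinite population is represented by `math.inf`. A main effect has an exact $F$ test against the error mean square when that coefficient is 0, against the interaction mean square when it equals $n$, and no exact test otherwise.

```python
import math


def two_way_ems(a, b, n, A, B):
    """Coefficients of the variance components in each row of Table 9.2."""
    return {
        "A": {"e": 1, "ab": n * (1 - b / B), "main": b * n},
        "B": {"e": 1, "ab": n * (1 - a / A), "main": a * n},
        "AB": {"e": 1, "ab": n, "main": 0},
    }


def exact_denominator(effect, a, b, n, A, B):
    """Mean square giving an exact F test for main effect 'A' or 'B', or None."""
    coef = two_way_ems(a, b, n, A, B)[effect]["ab"]
    if coef == 0:
        return "MS_E"    # interaction term drops out of the expected mean square
    if coef == n:
        return "MS_AB"   # expectation under H0 matches that of the interaction
    return None          # general finite case: no exact test


a, b, n = 3, 4, 2
print(exact_denominator("A", a, b, n, A=3, B=4))                # Model I
print(exact_denominator("A", a, b, n, A=math.inf, B=math.inf))  # Model II
print(exact_denominator("B", a, b, n, A=math.inf, B=4))         # Model III
print(exact_denominator("B", a, b, n, A=10, B=8))               # general case
```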


In the following we give the details of F tests of the hypotheses of interest for 
the general case of the finite population model including the point and interval 
estimation of the parameters. 




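F Tests

An $F$ test of the hypothesis

\[
H_0^{AB} : \sigma_{\alpha\beta}^2 = 0 \quad \text{versus} \quad H_1^{AB} : \sigma_{\alpha\beta}^2 > 0
\]

can be performed by using the statistic $MS_{AB}/MS_E$. Now, to test

\[
H_0^{B} : \sigma_\beta^2 = 0 \quad \text{versus} \quad H_1^{B} : \sigma_\beta^2 > 0,
\tag{9.2.2}
\]

we note from Table 9.2 that, when $\sigma_\beta^2 = 0$, the expected value of $MS_B$ is
$\sigma_e^2 + n(1 - a/A)\sigma_{\alpha\beta}^2$. Since there is no other mean square with this
expectation, an exact $F$ test for the hypothesis (9.2.2) is not available. However,
it is possible to determine a conservative or an approximate $F$ test.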

To obtain a conservative test, note that when $\sigma_\beta^2 = 0$, the quantity

\[
\frac{MS_B \Big/ \left[\sigma_e^2 + n\left(1 - \frac{a}{A}\right)\sigma_{\alpha\beta}^2\right]}
     {MS_{AB} \Big/ \left(\sigma_e^2 + n\sigma_{\alpha\beta}^2\right)}
\tag{9.2.3}
\]

has an $F$ distribution with $b - 1$ and $(a - 1)(b - 1)$ degrees of freedom,
respectively. It cannot, however, be used for testing the hypothesis (9.2.2), since
the expressions involving unknown parameters do not cancel out and therefore
the statistic (9.2.3) cannot be evaluated. However, if we change the coefficient
of $\sigma_{\alpha\beta}^2$ from $n(1 - \frac{a}{A})$ to $n$ by neglecting $a/A$, then
$\sigma_e^2 + n\sigma_{\alpha\beta}^2$ cancels out from both the numerator and the denominator,
and (9.2.3) reduces to the statistic

\[
F_B = MS_B / MS_{AB},
\]

which is readily evaluated. Note that in this way we have reduced the value of
the statistic (9.2.3), and, thus, if we compare it with $F[b - 1, (a - 1)(b - 1);
1 - \alpha]$, we have made it harder to reject the null hypothesis. Such a test is
conservative, since if we try to construct a test with significance level $\alpha$, we
actually have one with level $< \alpha$. When $A$ is sufficiently large compared to
$a$, this test is generally adequate. In addition, we can also develop a pseudo-$F$
test for the hypothesis (9.2.2) by using a linear combination of the mean squares
whose expected value is equal to $\sigma_e^2 + n(1 - \frac{a}{A})\sigma_{\alpha\beta}^2$. We note from
Table 9.2 that $E[(1 - \frac{a}{A})MS_{AB} + (\frac{a}{A})MS_E] = \sigma_e^2 + n(1 - \frac{a}{A})\sigma_{\alpha\beta}^2$.
Hence, the suggested $F$ statistic is

\[
F_B' = \frac{MS_B}{\left(1 - \frac{a}{A}\right)MS_{AB} + \left(\frac{a}{A}\right)MS_E},
\tag{9.2.4}
\]

which has an approximate $F$ distribution with $b - 1$ and $\nu'$ degrees of freedom,
where $\nu'$ is approximated by

\[
\nu' = \frac{\left[\left(1 - \frac{a}{A}\right)MS_{AB} + \left(\frac{a}{A}\right)MS_E\right]^2}
            {\dfrac{\left(1 - \frac{a}{A}\right)^2 (MS_{AB})^2}{(a - 1)(b - 1)}
             + \dfrac{\left(\frac{a}{A}\right)^2 (MS_E)^2}{ab(n - 1)}}.
\tag{9.2.5}
\]

Analogous tests, conservative as well as approximate, can be constructed for
the hypothesis $H_0^{A} : \sigma_\alpha^2 = 0$ versus $H_1^{A} : \sigma_\alpha^2 > 0$.




POINT ESTIMATION 


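The parameter $\mu$ is clearly estimated by

\[
\hat{\mu} = \bar{y}_{...},
\]

whose variance follows from the covariance structure given earlier as

\[
\operatorname{Var}(\bar{y}_{...}) = \frac{1}{abn}\left[\sigma_e^2 + n\left(1 - \frac{a}{A}\right)\left(1 - \frac{b}{B}\right)\sigma_{\alpha\beta}^2
 + an\left(1 - \frac{b}{B}\right)\sigma_\beta^2 + bn\left(1 - \frac{a}{A}\right)\sigma_\alpha^2\right].
\tag{9.2.6}
\]

The other parameters of interest are the variance components $\sigma_e^2$, $\sigma_{\alpha\beta}^2$,
$\sigma_\beta^2$, and $\sigma_\alpha^2$. From Table 9.2, on equating mean squares to their respective expected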




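values and solving for the variance components, we obtain

\[
\hat{\sigma}_e^2 = MS_E,
\tag{9.2.8}
\]
\[
\hat{\sigma}_{\alpha\beta}^2 = \frac{1}{n}\left(MS_{AB} - MS_E\right),
\tag{9.2.9}
\]
\[
\hat{\sigma}_\beta^2 = \frac{1}{an}\left[MS_B - \left(1 - \frac{a}{A}\right)MS_{AB} - \frac{a}{A}MS_E\right],
\tag{9.2.10}
\]

and

\[
\hat{\sigma}_\alpha^2 = \frac{1}{bn}\left[MS_A - \left(1 - \frac{b}{B}\right)MS_{AB} - \frac{b}{B}MS_E\right].
\tag{9.2.11}
\]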
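As a check on (9.2.8) through (9.2.11), the sketch below (the function name and the numerical values are hypothetical) feeds the estimators mean squares set equal to their expectations from Table 9.2 for chosen variance components and confirms that the components are recovered. The mean squares used correspond to $\sigma_e^2 = 1$, $\sigma_{\alpha\beta}^2 = 2$, $\sigma_\beta^2 = 3$, $\sigma_\alpha^2 = 4$ with $a = 3$, $b = 4$, $n = 2$, $A = 10$, $B = 8$.

```python
def variance_component_estimates(ms_a, ms_b, ms_ab, ms_e, a, b, n, A, B):
    """Analysis of variance estimators (9.2.8)-(9.2.11) for the two-way
    crossed finite population model."""
    fa, fb = a / A, b / B    # sampling fractions for factors A and B
    return {
        "e": ms_e,
        "ab": (ms_ab - ms_e) / n,
        "b": (ms_b - (1 - fa) * ms_ab - fa * ms_e) / (a * n),
        "a": (ms_a - (1 - fb) * ms_ab - fb * ms_e) / (b * n),
    }


# Mean squares equal to their expectations for the components stated above:
#   E(MS_E) = 1,  E(MS_AB) = 1 + 2*2 = 5,
#   E(MS_B) = 1 + 2*(1 - 0.3)*2 + 3*2*3 = 21.8,
#   E(MS_A) = 1 + 2*(1 - 0.5)*2 + 4*2*4 = 35.
estimates = variance_component_estimates(35.0, 21.8, 5.0, 1.0,
                                         a=3, b=4, n=2, A=10, B=8)
print(estimates)
```

As with any analysis of variance estimator, the values obtained from real data can be negative; nothing in this sketch truncates them.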


INTERVAL ESTIMATION 


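To find a confidence interval for $\mu$, we note that there is no entry in the mean
square column of Table 9.2 which is a multiple of $\operatorname{Var}(\bar{y}_{...})$ given by (9.2.6).
Thus, we cannot find an exact confidence interval for $\mu$. However, as earlier,
we can determine an approximate confidence interval using the Satterthwaite
procedure. Thus, from (9.2.6) and (9.2.7), we have

\[
\frac{\nu\, MS}
     {\sigma_e^2 + n\left(1 - \frac{a}{A}\right)\left(1 - \frac{b}{B}\right)\sigma_{\alpha\beta}^2
      + an\left(1 - \frac{b}{B}\right)\sigma_\beta^2 + bn\left(1 - \frac{a}{A}\right)\sigma_\alpha^2}
\overset{\text{approx}}{\sim} \chi^2[\nu],
\]

where

\[
MS = \frac{ab}{AB}MS_E - \left(1 - \frac{a}{A}\right)\left(1 - \frac{b}{B}\right)MS_{AB}
   + \left(1 - \frac{b}{B}\right)MS_B + \left(1 - \frac{a}{A}\right)MS_A
\tag{9.2.12}
\]

and $\nu$ is calculated from

\[
\nu = \frac{(MS)^2}
           {\left(\dfrac{ab}{AB}\right)^2 \dfrac{(MS_E)^2}{ab(n - 1)}
            + \left(1 - \dfrac{a}{A}\right)^2\left(1 - \dfrac{b}{B}\right)^2 \dfrac{(MS_{AB})^2}{(a - 1)(b - 1)}
            + \left(1 - \dfrac{b}{B}\right)^2 \dfrac{(MS_B)^2}{b - 1}
            + \left(1 - \dfrac{a}{A}\right)^2 \dfrac{(MS_A)^2}{a - 1}}.
\tag{9.2.13}
\]

Thus,

\[
\frac{\bar{y}_{...} - \mu}{\sqrt{MS/abn}} \overset{\text{approx}}{\sim} t[\nu],
\tag{9.2.14}
\]

and a $1 - \alpha$ level confidence interval for $\mu$ is given by

\[
\bar{y}_{...} \pm t[\nu, 1 - \alpha/2]\sqrt{MS/abn}.
\tag{9.2.15}
\]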


These limits are approximate because of the approximation involved in (9.2.14). 
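The combination (9.2.12) and the degrees of freedom (9.2.13) are straightforward to compute. The sketch below is illustrative: the function names and the mean squares are hypothetical, and the $t$ quantile is passed in as an argument rather than computed, so that only the standard library is needed.

```python
import math


def satterthwaite_for_mean(ms_a, ms_b, ms_ab, ms_e, a, b, n, A, B):
    """Return (MS, nu) as defined in (9.2.12) and (9.2.13)."""
    p, q = 1 - a / A, 1 - b / B
    # Each entry is (weighted mean square, its degrees of freedom).
    terms = [
        (a * b / (A * B) * ms_e, a * b * (n - 1)),
        (-p * q * ms_ab, (a - 1) * (b - 1)),
        (q * ms_b, b - 1),
        (p * ms_a, a - 1),
    ]
    ms = sum(t for t, _ in terms)
    nu = ms ** 2 / sum(t ** 2 / df for t, df in terms)
    return ms, nu


def half_width(ms, a, b, n, t_quantile):
    """Half-width of the interval (9.2.15) for a supplied t[nu, 1 - alpha/2]."""
    return t_quantile * math.sqrt(ms / (a * b * n))


# Hypothetical mean squares with a = 3, b = 4, n = 2, A = 10, B = 8.
ms, nu = satterthwaite_for_mean(35.0, 21.8, 5.0, 1.0, a=3, b=4, n=2, A=10, B=8)
print(ms, nu)
```

Because $MS_{AB}$ enters with a negative weight, $MS$ computed from real data can be small or even negative, in which case the approximation is of little use; the sketch does not guard against this.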

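Now, an examination of the mean square and expected mean square columns
of Table 9.2 indicates which variance components or their functions can be
easily estimated by a confidence interval. Thus, as in the case of the infinite
population model, a $100(1 - \alpha)$ percent confidence interval for $\sigma_e^2$ is given by

\[
\frac{ab(n - 1)MS_E}{\chi^2[ab(n - 1), 1 - \alpha/2]} < \sigma_e^2 < \frac{ab(n - 1)MS_E}{\chi^2[ab(n - 1), \alpha/2]}.
\tag{9.2.16}
\]

Furthermore,

\[
\frac{(a - 1)(b - 1)MS_{AB}}{\sigma_e^2 + n\sigma_{\alpha\beta}^2} \sim \chi^2[(a - 1)(b - 1)]
\]

and a $1 - \alpha$ level confidence interval for $\sigma_e^2 + n\sigma_{\alpha\beta}^2$ is given by

\[
\frac{(a - 1)(b - 1)MS_{AB}}{\chi^2[(a - 1)(b - 1), 1 - \alpha/2]}
< \sigma_e^2 + n\sigma_{\alpha\beta}^2 <
\frac{(a - 1)(b - 1)MS_{AB}}{\chi^2[(a - 1)(b - 1), \alpha/2]}.
\tag{9.2.17}
\]

Likewise

\[
\frac{(b - 1)MS_B}{\sigma_e^2 + n\left(1 - \frac{a}{A}\right)\sigma_{\alpha\beta}^2 + an\sigma_\beta^2} \sim \chi^2[b - 1]
\]

and a $1 - \alpha$ level confidence interval for
$\sigma_e^2 + n(1 - \frac{a}{A})\sigma_{\alpha\beta}^2 + an\sigma_\beta^2$ is given by

\[
\frac{(b - 1)MS_B}{\chi^2[b - 1, 1 - \alpha/2]}
< \sigma_e^2 + n\left(1 - \frac{a}{A}\right)\sigma_{\alpha\beta}^2 + an\sigma_\beta^2 <
\frac{(b - 1)MS_B}{\chi^2[b - 1, \alpha/2]}.
\tag{9.2.18}
\]

Similarly, a $1 - \alpha$ level confidence interval for
$\sigma_e^2 + n(1 - \frac{b}{B})\sigma_{\alpha\beta}^2 + bn\sigma_\alpha^2$ is given by

\[
\frac{(a - 1)MS_A}{\chi^2[a - 1, 1 - \alpha/2]}
< \sigma_e^2 + n\left(1 - \frac{b}{B}\right)\sigma_{\alpha\beta}^2 + bn\sigma_\alpha^2 <
\frac{(a - 1)MS_A}{\chi^2[a - 1, \alpha/2]}.
\tag{9.2.19}
\]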


Unfortunately, as noted in earlier chapters, the expressions in the expected
mean square column are the only quantities for which we can find exact
confidence intervals in this manner. Thus, there do not exist exact confidence intervals
for $\sigma_{\alpha\beta}^2$, $\sigma_\beta^2$, and $\sigma_\alpha^2$, and we have to resort to some approximate results. There
are various methods for obtaining approximate intervals (see, e.g., Bennett and
Franklin, 1954, Chapter VII). In the following, we give a method for obtaining
a conservative confidence interval in the sense that the confidence level is at
least $1 - \alpha$. Inasmuch as variance components are nonnegative, it is possible
to obtain a conservative confidence interval by simply deleting the undesired
terms (nuisance parameters) in the usual confidence intervals obtained by using
the chi-square table. For example, from (9.2.17), we can delete the term
containing $\sigma_e^2$ and it will yield a conservative $100(1 - \alpha)$ percent confidence
interval for $\sigma_{\alpha\beta}^2$ as

\[
\frac{(a - 1)(b - 1)MS_{AB}}{n\chi^2[(a - 1)(b - 1), 1 - \alpha/2]}
< \sigma_{\alpha\beta}^2 <
\frac{(a - 1)(b - 1)MS_{AB}}{n\chi^2[(a - 1)(b - 1), \alpha/2]}.
\tag{9.2.20}
\]

Similarly, from (9.2.18) and (9.2.19), one can obtain conservative $100(1 - \alpha)$
percent confidence intervals for $\sigma_\beta^2$ and $\sigma_\alpha^2$ given by

\[
\frac{(b - 1)MS_B}{an\chi^2[b - 1, 1 - \alpha/2]}
< \sigma_\beta^2 <
\frac{(b - 1)MS_B}{an\chi^2[b - 1, \alpha/2]}
\tag{9.2.21}
\]

and

\[
\frac{(a - 1)MS_A}{bn\chi^2[a - 1, 1 - \alpha/2]}
< \sigma_\alpha^2 <
\frac{(a - 1)MS_A}{bn\chi^2[a - 1, \alpha/2]}.
\tag{9.2.22}
\]


9.3. THREE-WAY CROSSED FINITE POPULATION MODEL 


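The finite population model for the replicated three-way crossed classification is
a natural generalization of the two-way crossed finite population model (9.2.1).
Thus, with factor $A$ at $a$ levels, factor $B$ at $b$ levels, factor $C$ at $c$ levels, and
$n$ replications per cell, the model equation may be written as

\[
y_{ijk\ell} = \mu + \alpha_i + \beta_j + \gamma_k + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk}
 + (\alpha\beta\gamma)_{ijk} + e_{ijk\ell}
\qquad
\begin{cases}
i = 1, \ldots, a \\
j = 1, \ldots, b \\
k = 1, \ldots, c \\
\ell = 1, \ldots, n,
\end{cases}
\tag{9.3.1}
\]

where the terms on the right-hand side of equation (9.3.1) have the familiar
meanings and correspond to the general mean; main effects due to factors
$A$, $B$, $C$; interactions $A \times B$, $A \times C$, $B \times C$, $A \times B \times C$; and the error term,
respectively.

The assumptions of the finite population model (9.3.1) are stated in a way
similar to those for the finite two-way model (9.2.1). Thus, for example, we
suppose that the $\alpha_i$'s are a random sample of size $a$ from a population of size
$A$, and the $\beta_j$'s and $\gamma_k$'s are random samples of sizes $b$ and $c$ from
populations of sizes $B$ and $C$, respectively. In addition, in the population, the various
parameters sum to zero over each index; that is,

\[
\begin{aligned}
&\sum_{I=1}^{A} \alpha_I = \sum_{J=1}^{B} \beta_J = \sum_{K=1}^{C} \gamma_K = 0, \\
&\sum_{I=1}^{A} (\alpha\beta)_{IJ} = \sum_{J=1}^{B} (\alpha\beta)_{IJ} = \sum_{I=1}^{A} (\alpha\gamma)_{IK} = \sum_{K=1}^{C} (\alpha\gamma)_{IK}
 = \sum_{J=1}^{B} (\beta\gamma)_{JK} = \sum_{K=1}^{C} (\beta\gamma)_{JK} = 0, \\
&\sum_{I=1}^{A} (\alpha\beta\gamma)_{IJK} = \sum_{J=1}^{B} (\alpha\beta\gamma)_{IJK} = \sum_{K=1}^{C} (\alpha\beta\gamma)_{IJK} = 0;
\end{aligned}
\tag{9.3.2}
\]

and we make the following definitions of the population variances, that is, the
variance components of model (9.3.1):

\[
\begin{aligned}
\sigma_\alpha^2 &= \frac{1}{A - 1}\sum_{I=1}^{A} \alpha_I^2, \ \text{etc.,} \\
\sigma_{\alpha\beta}^2 &= \frac{1}{(A - 1)(B - 1)}\sum_{I=1}^{A}\sum_{J=1}^{B} (\alpha\beta)_{IJ}^2, \ \text{etc.,}
\end{aligned}
\tag{9.3.3}
\]

and

\[
\sigma_{\alpha\beta\gamma}^2 = \frac{1}{(A - 1)(B - 1)(C - 1)}\sum_{I=1}^{A}\sum_{J=1}^{B}\sum_{K=1}^{C} (\alpha\beta\gamma)_{IJK}^2.
\]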

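The sums of squares and mean squares are the same as those shown in Table 5.2.
The expected mean squares, which can be derived by an extension of the method
used for the two-way model (see, e.g., Cornfield and Tukey (1956)), are
displayed in Table 9.3.

Again, it is readily seen that when $A = a$, $B = b$, and $C = c$, the expected
values of the mean squares are those given in Table 5.2 for Model I and the finite
population model becomes exactly Model I. When $A$, $B$, and $C$ are all infinite,
then the expected values of the mean squares are those given in Table 5.2 for
Model II and the finite model becomes exactly Model II. When $A$ is infinite,
$B = b$, and $C = c$, we have a Model III situation where factor $A$ is random and
factors $B$ and $C$ are fixed; and the case when $A$ and $B$ are infinite and $C = c$
gives a Model III with factors $A$ and $B$ as random and factor $C$ as fixed.

TABLE 9.3
Expected Mean Squares for Model (9.3.1)

Source       Expected Mean Square
A            $\sigma_e^2 + n\left(1 - \frac{b}{B}\right)\left(1 - \frac{c}{C}\right)\sigma_{\alpha\beta\gamma}^2 + bn\left(1 - \frac{c}{C}\right)\sigma_{\alpha\gamma}^2 + cn\left(1 - \frac{b}{B}\right)\sigma_{\alpha\beta}^2 + bcn\sigma_\alpha^2$
B            $\sigma_e^2 + n\left(1 - \frac{a}{A}\right)\left(1 - \frac{c}{C}\right)\sigma_{\alpha\beta\gamma}^2 + an\left(1 - \frac{c}{C}\right)\sigma_{\beta\gamma}^2 + cn\left(1 - \frac{a}{A}\right)\sigma_{\alpha\beta}^2 + acn\sigma_\beta^2$
C            $\sigma_e^2 + n\left(1 - \frac{a}{A}\right)\left(1 - \frac{b}{B}\right)\sigma_{\alpha\beta\gamma}^2 + an\left(1 - \frac{b}{B}\right)\sigma_{\beta\gamma}^2 + bn\left(1 - \frac{a}{A}\right)\sigma_{\alpha\gamma}^2 + abn\sigma_\gamma^2$
A × B        $\sigma_e^2 + n\left(1 - \frac{c}{C}\right)\sigma_{\alpha\beta\gamma}^2 + cn\sigma_{\alpha\beta}^2$
A × C        $\sigma_e^2 + n\left(1 - \frac{b}{B}\right)\sigma_{\alpha\beta\gamma}^2 + bn\sigma_{\alpha\gamma}^2$
B × C        $\sigma_e^2 + n\left(1 - \frac{a}{A}\right)\sigma_{\alpha\beta\gamma}^2 + an\sigma_{\beta\gamma}^2$
A × B × C    $\sigma_e^2 + n\sigma_{\alpha\beta\gamma}^2$
Error        $\sigma_e^2$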
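The entries of Tables 9.2 and 9.3 follow a regular pattern that can be generated programmatically. The sketch below encodes that pattern as read from the tables, for fully crossed layouts only; it is not the derivation in Cornfield and Tukey (1956), and the function name `crossed_ems` is hypothetical. For the mean square of an effect $S$, every effect $T$ containing $S$ contributes $\sigma_T^2$ with coefficient $n$, times the sample number of levels of each factor not in $T$, times $(1 - \text{sample levels}/\text{population levels})$ for each factor in $T$ but not in $S$. The coefficient of $\sigma_e^2$ is always 1 and is left implicit.

```python
import math
from itertools import combinations


def crossed_ems(sample, population, n):
    """Coefficients of the variance components in the expected mean squares
    of a fully crossed finite population model.

    sample / population map single-letter factor names to sample and
    population numbers of levels (use math.inf for an infinite population).
    """
    factors = sorted(sample)
    effects = [frozenset(c) for r in range(1, len(factors) + 1)
               for c in combinations(factors, r)]
    table = {}
    for s in effects:
        terms = {}
        for t in effects:
            if s <= t:                      # only effects containing s appear
                coef = n
                for f in factors:
                    if f not in t:
                        coef *= sample[f]   # levels of factors absent from t
                    elif f not in s:
                        coef *= 1 - sample[f] / population[f]
                terms["".join(sorted(t))] = coef
        table["".join(sorted(s))] = terms
    return table


# Hypothetical design: a = 3 of A = 6, b = 4 of B = 8, c = 2 of an infinite C.
ems = crossed_ems({"A": 3, "B": 4, "C": 2},
                  {"A": 6, "B": 8, "C": math.inf}, n=2)
for component, coefficient in ems["A"].items():
    print(component, coefficient)
```

Setting a population size equal to the sample size makes the corresponding factor $(1 - \text{sample}/\text{population})$ vanish, which reproduces the fixed-effect rows of the Model I and Model III tables.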


9.4 FOUR-WAY CROSSED FINITE POPULATION MODEL 


The finite population model for the four-way classification is the obvious ex- 
tension of two and three-way finite population models and we survey it only 
briefly. Thus, with factor A at A levels, factor B at B levels, factor C at C levels, 
factor D at D levels, and n replicates per cell, the model is 


=1,..., 
Yijktm = M+ Oj + By + Ve + Oe + (OB) + OY Dix : =1,.. b 
+ (@d)ie + (BY) jx + (BS) je + Ve + OBY ijk Jp 1 
+ (@BS);5¢ + (AVS) ize + (BYS) ize + COBY 9S); jxe e= 1 : d 
+ Ci jkem m= - n 
(9.4.1) 


where the terms on the right-hand side of equation (9.4.1) have familiar mean- 
ings. For example, the a@;’s are a random sample of size a from a population of 
size A and sum to zero in the population; for example, 


A 
) a; = 0; 
I=1 


Finite Population and Other Models 473 


with corresponding results for the other main effects. Similarly, the two-way 
interactions sum to zero in the population over each index; for example, 


A B 
> (aB)77 = > (aB)7) = 0, 
I=1 J=1 


and so on. The three-way and four-way interactions also sum to zero in the 
population over each index; for example, 


A B C 
SY) @BY)raK = Yd @BY 1K = > (aBY rox = 9, 
I=1 J=!1 K=1 
and so on, and 
A B C 
Y-@BYS)yK1 = > BYS) KL =), OBYS KL 
I=] J=1 K=1 


D 
= >> @BYS) 1s KL = 0. 
L=1 


a 
The sums of squares and mean squares are the same as defined in Section 5.11. 


The expected mean squares can again be derived by an extension of the method 
used for the two- and three-way finite population models. Thus, for example, 


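$$
\begin{aligned}
E(MS_A) = {} & \sigma_e^2 + n\left(1 - \frac{b}{B}\right)\left(1 - \frac{c}{C}\right)\left(1 - \frac{d}{D}\right)\sigma_{\alpha\beta\gamma\delta}^2 \\
 & + nd\left(1 - \frac{b}{B}\right)\left(1 - \frac{c}{C}\right)\sigma_{\alpha\beta\gamma}^2
   + nc\left(1 - \frac{b}{B}\right)\left(1 - \frac{d}{D}\right)\sigma_{\alpha\beta\delta}^2 \\
 & + nb\left(1 - \frac{c}{C}\right)\left(1 - \frac{d}{D}\right)\sigma_{\alpha\gamma\delta}^2
   + ncd\left(1 - \frac{b}{B}\right)\sigma_{\alpha\beta}^2 \\
 & + nbd\left(1 - \frac{c}{C}\right)\sigma_{\alpha\gamma}^2
   + nbc\left(1 - \frac{d}{D}\right)\sigma_{\alpha\delta}^2 + nbcd\,\sigma_{\alpha}^2, \quad \text{etc.,}
\end{aligned}
$$

$$
\begin{aligned}
E(MS_{AB}) = {} & \sigma_e^2 + n\left(1 - \frac{c}{C}\right)\left(1 - \frac{d}{D}\right)\sigma_{\alpha\beta\gamma\delta}^2
   + nd\left(1 - \frac{c}{C}\right)\sigma_{\alpha\beta\gamma}^2 \\
 & + nc\left(1 - \frac{d}{D}\right)\sigma_{\alpha\beta\delta}^2 + ncd\,\sigma_{\alpha\beta}^2, \quad \text{etc.,}
\end{aligned}
$$

$$E(MS_{ABC}) = \sigma_e^2 + n\left(1 - \frac{d}{D}\right)\sigma_{\alpha\beta\gamma\delta}^2 + nd\,\sigma_{\alpha\beta\gamma}^2, \quad \text{etc.,}$$

and

$$E(MS_{ABCD}) = \sigma_e^2 + n\,\sigma_{\alpha\beta\gamma\delta}^2,$$

where $\sigma_{\alpha}^2$, $\sigma_{\alpha\beta}^2$, $\sigma_{\alpha\beta\gamma}^2$, $\sigma_{\alpha\beta\gamma\delta}^2$, etc., are defined in a manner analogous to (9.3.3).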
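The pattern in these expected mean squares can be sketched in a few lines of Python. This is only an illustration of the rule visible in the displayed formulas: every component whose subscript contains the source contributes, multiplied by n, by the sample sizes of the factors absent from the component, and by a finite population correction for each extra factor. The function name `ems_terms` and the plain-text notation `s2[...]` are hypothetical choices made for this sketch.

```python
from itertools import combinations


def ems_terms(source, factors="ABCD"):
    """Return E(MS_source) for the balanced crossed finite population model as text.

    `source` is a string of factor letters, e.g. "A" or "ABC".
    s2[e] denotes the error variance and s2[AB] the A x B component, etc.
    """
    others = [f for f in factors if f not in source]
    terms = []
    # Walk from the highest-order interaction down to the source itself.
    for r in range(len(others), -1, -1):
        for extra in combinations(others, r):
            comp = "".join(sorted(source + "".join(extra)))
            # Coefficient: n times the sample sizes of factors absent from comp.
            coef = "n" + "".join(f.lower() for f in factors if f not in comp)
            # One correction (1 - x/X) for each factor in comp but not in source.
            fpc = "".join(f"(1 - {f.lower()}/{f})" for f in extra)
            terms.append(f"{coef}{fpc} s2[{comp}]")
    return "s2[e] + " + " + ".join(terms)


for src in ("A", "AB", "ABC", "ABCD"):
    print(f"E(MS_{src}) = {ems_terms(src)}")
```

Setting a sampling fraction such as d/D to 1 (factor D fixed) or to 0 (factor D random) in the printed expressions recovers the corresponding fixed, random, or mixed model entries.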




9.5 NESTED FINITE POPULATION MODELS 


Similar to the crossed classification models, we can develop finite population 
models for the nested classification. We briefly consider here the two-way nested 
finite population model. Thus, corresponding to the infinite population model 
(6.1.1), we have 


$$
y_{ijk} = \mu + \alpha_i + \beta_{j(i)} + e_{k(ij)}
\qquad
\begin{cases}
i = 1, 2, \ldots, a \\
j = 1, 2, \ldots, b \\
k = 1, 2, \ldots, n,
\end{cases}
\tag{9.5.1}
$$


where the usual assumptions of the finite population model (9.5.1) are as fol- 
lows: 


(i) As before, $\mu$ is the constant general effect.

(ii) There is a population of $\alpha_I$'s of size A with $\sum_{I=1}^{A} \alpha_I = 0$ and the $\alpha_i$'s are a random sample of size a from this population.

(iii) Associated with each I is a population of $\beta_{J(I)}$'s of size B. For each of these populations of size B, we have the condition that $\sum_{J=1}^{B} \beta_{J(I)} = 0$ for each I = 1, 2, ..., A. The $\beta_{j(i)}$'s are random samples of size b from these populations. It should be noted, however, that for each value of I, the entire set of B $\beta_{J(I)}$'s sum to zero; but, in general, the $\beta_{j(i)}$'s do not sum to zero for the sample of size b unless b = B. Also, the $\beta_{J(I)}$'s do not, in general, sum to zero within a row; that is, $\sum_{I=1}^{A} \beta_{J(I)} \neq 0$.

(iv) The $e_{k(ij)}$'s are a random sample of size n from an infinite population with mean zero and variance $\sigma_e^2$.

(v) We make the following definitions of the finite population variances or the variance components of model (9.5.1):

$$\sigma_{\alpha}^2 = \frac{1}{A - 1} \sum_{I=1}^{A} \alpha_I^2, \qquad \sigma_{\beta}^2 = \frac{1}{A(B - 1)} \sum_{I=1}^{A} \sum_{J=1}^{B} \beta_{J(I)}^2,$$

and

$$\sigma_e^2 = E\left(e_{k(ij)}^2\right).$$


Again, the sums of squares and mean squares are the same as those given in 
Table 6.2. However, the expected mean squares are those shown in Table 9.4. 
The derivation of the expected mean squares follows the same general approach 
of the crossed situation and can be found in Bennett and Franklin (1954, pp. 
358-363). Note that in Table 9.4 if both factors A and B constitute the entire population (i.e., A = a and B = b), then the expected values of the mean squares become identical to those given in Table 6.2 for Model I. If both A = ∞ and B = ∞, we get Model II; and the case with A = a and B = ∞ gives Model III.

TABLE 9.4
Analysis of Variance for Model (9.5.1)

Source of            Degrees of    Sum of      Mean         Expected
Variation            Freedom       Squares     Square       Mean Square

Factor A             a − 1         SS_A        MS_A         $\sigma_e^2 + n\left(1 - \frac{b}{B}\right)\sigma_{\beta}^2 + bn\sigma_{\alpha}^2$
Factor B within A    a(b − 1)      SS_B(A)     MS_B(A)      $\sigma_e^2 + n\sigma_{\beta}^2$
Error                ab(n − 1)     SS_E        MS_E         $\sigma_e^2$
Total                abn − 1       SS_T

The finite population model (9.5.1) can be extended similarly to higher-order 
nested classifications. 


9.6 UNBALANCED FINITE POPULATION MODELS 


In the preceding sections, we have dealt mainly with finite population models 
having balanced sampling. The details on the models involving unequal sam- 
pling can be found in the papers of Gaylor and Hartwell (1969) and Searle and 
Fawcett (1970). 


9.7 WORKED EXAMPLE FOR A FINITE POPULATION MODEL 


Consider an industrial experiment involving 3 machines and 4 operators. Ma- 
chines were randomly selected from a set of 10 machines and operators were 
chosen at random from a group of 12 available operators. Three observations 
were made on each of the 12 machine-operator combinations and the data on 
production output are given in Table 9.5. 

This is an example of a two-way crossed finite population model with replica- 
tion where both machines and operators are randomly selected from populations 
involving only a finite number of elements. The mathematical model for this 
experiment would be 


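$$
y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + e_{ijk}
\qquad
\begin{cases}
i = 1, 2, 3 \\
j = 1, 2, 3, 4 \\
k = 1, 2, 3,
\end{cases}
$$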


where $\mu$ is the constant general effect, $\alpha_i$ is the effect of the i-th machine, $\beta_j$ is the effect of the j-th operator, $(\alpha\beta)_{ij}$ is the interaction effect of the i-th machine with the j-th operator, and $e_{ijk}$ is the customary error term. Furthermore, it is assumed that the $\alpha_i$'s are a random sample of size 3 from a population of $\alpha_I$'s that satisfy the condition $\sum_{I=1}^{10} \alpha_I = 0$; the $\beta_j$'s are a random sample of size 4 from a population of $\beta_J$'s that satisfy the condition $\sum_{J=1}^{12} \beta_J = 0$; the $(\alpha\beta)_{ij}$'s are a random sample from a population of $(\alpha\beta)_{IJ}$'s that satisfy the conditions $\sum_{I=1}^{10} (\alpha\beta)_{IJ} = 0$ for each J and $\sum_{J=1}^{12} (\alpha\beta)_{IJ} = 0$ for each I; and the $e_{ijk}$'s are a random sample of size 3 from an infinite population with mean zero and variance $\sigma_e^2$. The population variances of the $\alpha_I$'s, $\beta_J$'s, and $(\alpha\beta)_{IJ}$'s are defined as follows:

$$\sigma_{\alpha}^2 = \frac{1}{10 - 1} \sum_{I=1}^{10} \alpha_I^2,$$

$$\sigma_{\beta}^2 = \frac{1}{12 - 1} \sum_{J=1}^{12} \beta_J^2,$$

and

$$\sigma_{\alpha\beta}^2 = \frac{1}{(10 - 1)(12 - 1)} \sum_{I=1}^{10} \sum_{J=1}^{12} (\alpha\beta)_{IJ}^2.$$

TABLE 9.5
Production Output from an Industrial Experiment

                     Operator
Machine      1       2       3       4

1          26.3    26.0    25.7    25.0
           26.9    25.2    26.0    25.3
           27.2    24.6    26.2    25.0
2          26.7    26.0    26.2    25.5
           27.0    26.4    26.1    24.7
           26.9    26.6    27.3    26.0
3          26.8    26.6    26.5    26.2
           27.0    27.0    27.5    25.5
           27.2    26.9    27.0    25.7



The analysis of variance computations for degrees of freedom, sums of squares, 
and mean squares are performed as in the case of an infinite population model 
and the results are shown in Table 9.6. 
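As a check on the arithmetic, the sums of squares and mean squares can be recomputed directly from the data of Table 9.5. The following is a minimal sketch in plain Python, not output from any of the packages discussed in this book; the variable names are chosen here for illustration only.

```python
# data[i][j] holds the n = 3 observations for machine i+1 and operator j+1 (Table 9.5).
data = [
    [[26.3, 26.9, 27.2], [26.0, 25.2, 24.6], [25.7, 26.0, 26.2], [25.0, 25.3, 25.0]],
    [[26.7, 27.0, 26.9], [26.0, 26.4, 26.6], [26.2, 26.1, 27.3], [25.5, 24.7, 26.0]],
    [[26.8, 27.0, 27.2], [26.6, 27.0, 26.9], [26.5, 27.5, 27.0], [26.2, 25.5, 25.7]],
]
a, b, n = len(data), len(data[0]), len(data[0][0])


def mean(values):
    values = list(values)
    return sum(values) / len(values)


grand = mean(y for row in data for cell in row for y in cell)
row_means = [mean(y for cell in row for y in cell) for row in data]        # machines
col_means = [mean(y for row in data for y in row[j]) for j in range(b)]    # operators
cell_means = [[mean(cell) for cell in row] for row in data]

ss = {
    "Machine": b * n * sum((m - grand) ** 2 for m in row_means),
    "Operator": a * n * sum((m - grand) ** 2 for m in col_means),
    "Interaction": n * sum(
        (cell_means[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(a) for j in range(b)
    ),
    "Error": sum(
        (y - cell_means[i][j]) ** 2
        for i in range(a) for j in range(b) for y in data[i][j]
    ),
}
df = {"Machine": a - 1, "Operator": b - 1,
      "Interaction": (a - 1) * (b - 1), "Error": a * b * (n - 1)}
ms = {source: ss[source] / df[source] for source in ss}

for source in ss:
    print(f"{source:12s} df={df[source]:2d}  SS={ss[source]:8.4f}  MS={ms[source]:6.3f}")
```

The degrees of freedom, sums of squares, and mean squares do not depend on the population sizes 10 and 12; only the expected mean squares, and hence the choice of test statistics, do.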

TABLE 9.6
Analysis of Variance for the Production Output Data of Table 9.5

Source of      Degrees of    Sum of      Mean       Expected
Variation      Freedom       Squares     Square     Mean Square

Machine         2             4.6250     2.312      $\sigma_e^2 + 3\left(1 - \frac{4}{12}\right)\sigma_{\alpha\beta}^2 + 4 \times 3\,\sigma_{\alpha}^2$
Operator        3            10.3364     3.445      $\sigma_e^2 + 3\left(1 - \frac{3}{10}\right)\sigma_{\alpha\beta}^2 + 3 \times 3\,\sigma_{\beta}^2$
Interaction     6             1.6261     0.271      $\sigma_e^2 + 3\sigma_{\alpha\beta}^2$
Error          24             4.5000     0.188      $\sigma_e^2$
Total          35            21.0875

Assuming normality we can test the hypotheses of interest using the results shown in Table 9.6. The results on expected mean squares provide a valuable guide to deciding which mean squares are to be compared. The interaction hypothesis $H_0^{AB}: \sigma_{\alpha\beta}^2 = 0$ versus $H_1^{AB}: \sigma_{\alpha\beta}^2 > 0$ is tested by comparing the ratio 0.271/0.188 = 1.44 with the percentile of the theoretical F distribution with (6, 24) degrees of freedom, which is not significant (p = 0.241). To test the hypothesis regarding the presence of a main effect due to operator (i.e., $H_0^{B}: \sigma_{\beta}^2 = 0$ versus $H_1^{B}: \sigma_{\beta}^2 > 0$), we notice that there does not exist an exact F test. However, as noted in Section 9.2, a conservative test can be performed by comparing the ratio 3.445/0.271 = 12.71 with the percentile of the theoretical F distribution with (3, 6) degrees of freedom, which is highly significant (p = 0.005). Similarly, a conservative test of the hypothesis regarding the presence of a main effect due to machine (i.e., $H_0^{A}: \sigma_{\alpha}^2 = 0$ versus $H_1^{A}: \sigma_{\alpha}^2 > 0$) is performed by comparing the ratio 2.312/0.271 = 8.53 with the percentile of the theoretical F distribution with (2, 6) degrees of freedom, which is significant at the 2 percent level (p = 0.017). In addition, we can also perform pseudo-F tests for the hypotheses considered previously. As discussed in Section 9.2, a pseudo-F test for the hypothesis $H_0^{B}: \sigma_{\beta}^2 = 0$ versus $H_1^{B}: \sigma_{\beta}^2 > 0$ is performed using the statistic


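$$F'_B = \frac{MS_B}{\left(1 - \frac{a}{A}\right) MS_{AB} + \left(\frac{a}{A}\right) MS_E},$$

which has an approximate F distribution with 3 and $\nu'_B$ degrees of freedom where

$$\nu'_B = \frac{\left(\left(1 - \frac{a}{A}\right) MS_{AB} + \left(\frac{a}{A}\right) MS_E\right)^2}{\dfrac{\left(1 - \frac{a}{A}\right)^2 (MS_{AB})^2}{(a - 1)(b - 1)} + \dfrac{\left(\frac{a}{A}\right)^2 (MS_E)^2}{ab(n - 1)}}.$$
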


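In the example at hand, $F'_B$ and $\nu'_B$ (rounded to the nearest digit) are found to be

$$F'_B = \frac{3.445}{\left(1 - \frac{3}{10}\right)(0.271) + \left(\frac{3}{10}\right)(0.188)} = 14.00$$

and

$$\nu'_B = \frac{\left(\left(1 - \frac{3}{10}\right)(0.271) + \left(\frac{3}{10}\right)(0.188)\right)^2}{\dfrac{\left(1 - \frac{3}{10}\right)^2 (0.271)^2}{6} + \dfrac{\left(\frac{3}{10}\right)^2 (0.188)^2}{24}} \doteq 10.$$

These values lead to essentially the same result as the conservative test obtained earlier with even higher significance (p < 0.001). Similarly, a pseudo-F test for the hypothesis $H_0^{A}: \sigma_{\alpha}^2 = 0$ versus $H_1^{A}: \sigma_{\alpha}^2 > 0$ is determined using the statistic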


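$$F'_A = \frac{MS_A}{\left(1 - \frac{b}{B}\right) MS_{AB} + \left(\frac{b}{B}\right) MS_E},$$

which has an approximate F distribution with 2 and $\nu'_A$ degrees of freedom where

$$\nu'_A = \frac{\left(\left(1 - \frac{b}{B}\right) MS_{AB} + \left(\frac{b}{B}\right) MS_E\right)^2}{\dfrac{\left(1 - \frac{b}{B}\right)^2 (MS_{AB})^2}{(a - 1)(b - 1)} + \dfrac{\left(\frac{b}{B}\right)^2 (MS_E)^2}{ab(n - 1)}}.$$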


Again, in the example at hand, the values of $F'_A$ and $\nu'_A$ (rounded to the nearest digit) are found to be

$$F'_A = \frac{2.312}{\left(1 - \frac{4}{12}\right)(0.271) + \left(\frac{4}{12}\right)(0.188)} = 9.50$$

and

$$\nu'_A = \frac{\left(\left(1 - \frac{4}{12}\right)(0.271) + \left(\frac{4}{12}\right)(0.188)\right)^2}{\dfrac{\left(1 - \frac{4}{12}\right)^2 (0.271)^2}{6} + \dfrac{\left(\frac{4}{12}\right)^2 (0.188)^2}{24}} \doteq 11.$$

These values also lead to essentially the same result as the conservative test obtained earlier with even higher significance (p = 0.004).
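The pseudo-F statistics and their approximate denominator degrees of freedom follow a single pattern, which the short Python sketch below evaluates for both main effects. The helper name `pseudo_f` is hypothetical; the mean squares are the rounded values of Table 9.6, and p-values are deliberately not computed here since they require an F distribution routine.

```python
def pseudo_f(ms_effect, ms_ab, ms_e, fraction, df_ab, df_e):
    """Return (F', nu') for a main effect in the two-way finite population model.

    `fraction` is the sampling fraction of the *other* factor
    (a/A when testing operators, b/B when testing machines).
    """
    part_ab = (1 - fraction) * ms_ab
    part_e = fraction * ms_e
    denominator = part_ab + part_e
    # Satterthwaite-type approximation to the denominator degrees of freedom.
    nu = denominator ** 2 / (part_ab ** 2 / df_ab + part_e ** 2 / df_e)
    return ms_effect / denominator, nu


# Rounded mean squares from Table 9.6; a = 3 of A = 10 machines, b = 4 of B = 12 operators.
f_b, nu_b = pseudo_f(3.445, 0.271, 0.188, 3 / 10, df_ab=6, df_e=24)   # operators
f_a, nu_a = pseudo_f(2.312, 0.271, 0.188, 4 / 12, df_ab=6, df_e=24)   # machines

print(f"operators: F' = {f_b:.2f}, nu' = {nu_b:.2f}")
print(f"machines:  F' = {f_a:.2f}, nu' = {nu_a:.2f}")
```

Because the denominator mixes the interaction and error mean squares, its degrees of freedom fall between 6 and 24, which is why the pseudo-F tests are somewhat less conservative than the tests against the interaction mean square alone.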

Now, to assess the relative contribution of individual variance components, 
we may obtain their estimates using formulae (9.2.8) through (9.2.11). Thus, 
we find that 


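$$\hat{\sigma}_e^2 = 0.188,$$

$$\hat{\sigma}_{\alpha\beta}^2 = \frac{1}{3}(0.271 - 0.188) = 0.028,$$

$$\hat{\sigma}_{\beta}^2 = \frac{1}{3 \times 3}\left[3.445 - \left(1 - \frac{3}{10}\right)(0.271) - \left(\frac{3}{10}\right)(0.188)\right] = 0.355,$$

and

$$\hat{\sigma}_{\alpha}^2 = \frac{1}{4 \times 3}\left[2.312 - \left(1 - \frac{4}{12}\right)(0.271) - \left(\frac{4}{12}\right)(0.188)\right] = 0.172.$$

These components (error, interaction, operator, and machine, in that order) account for 25.3, 3.8, 47.8, and 23.1 percent of the total variation, respectively. The results are consistent with the tests of hypotheses performed earlier.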
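The estimates and their percentage contributions can be reproduced with the following short Python sketch. It assumes the rounded mean squares of Table 9.6 and, as in the text, rounds each estimate to three decimals before forming percentages; the dictionary keys are illustrative labels only.

```python
a, b, n = 3, 4, 3        # sampled machines, sampled operators, replicates
A, B = 10, 12            # population sizes for machines and operators
ms = {"A": 2.312, "B": 3.445, "AB": 0.271, "E": 0.188}

estimates = {
    "error": ms["E"],
    "interaction": (ms["AB"] - ms["E"]) / n,
    "operator": (ms["B"] - (1 - a / A) * ms["AB"] - (a / A) * ms["E"]) / (a * n),
    "machine": (ms["A"] - (1 - b / B) * ms["AB"] - (b / B) * ms["E"]) / (b * n),
}
# Round to three decimals first, as the printed estimates are.
estimates = {name: round(value, 3) for name, value in estimates.items()}

total = sum(estimates.values())
percent = {name: round(100 * value / total, 1) for name, value in estimates.items()}

for name in estimates:
    print(f"{name:12s} {estimates[name]:.3f}  ({percent[name]:.1f}% of total)")
```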

We can further proceed to obtain confidence intervals for the variance components. To determine a 95 percent confidence interval for $\sigma_e^2$, we have

$$MS_E = 0.188, \quad \chi^2[24, 0.025] = 12.397, \quad \text{and} \quad \chi^2[24, 0.975] = 39.380.$$

Substituting the values in (9.2.16), the desired 95 percent confidence interval for $\sigma_e^2$ is given by

$$P\left[\frac{24 \times 0.188}{39.380} < \sigma_e^2 < \frac{24 \times 0.188}{12.397}\right] = 0.95$$

or

$$P\left[0.114 < \sigma_e^2 < 0.363\right] = 0.95.$$



As noted in Section 9.2, there do not exist exact confidence intervals for $\sigma_{\alpha\beta}^2$, $\sigma_{\beta}^2$, and $\sigma_{\alpha}^2$; however, we can obtain their conservative confidence intervals. Using formulae (9.2.20) through (9.2.22), it can be verified that the conservative confidence intervals for $\sigma_{\alpha\beta}^2$, $\sigma_{\beta}^2$, and $\sigma_{\alpha}^2$ are given as follows:

$$P\left[0.038 < \sigma_{\alpha\beta}^2 < 0.437\right] \geq 0.95,$$

$$P\left[0.123 < \sigma_{\beta}^2 < 5.220\right] \geq 0.95,$$

and

$$P\left[0.052 < \sigma_{\alpha}^2 < 7.707\right] \geq 0.95.$$

It should be remarked that the variance components are in general highly variable, and the lengths of the confidence intervals given previously attest to the fact that there is a great deal of uncertainty involved in their estimates.

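Finally, we construct a confidence interval for the general constant $\mu$. As noted earlier, there does not exist an exact confidence interval for this problem. However, we can obtain an approximate 95 percent confidence interval for $\mu$ as

$$\bar{y}_{...} \pm t[\nu, 0.975] \sqrt{\frac{MS}{abn}},$$

where MS and $\nu$ are defined as in (9.2.12) and (9.2.13), respectively. For the example at hand, $\bar{y}_{...}$, MS, $\nu$ (rounded to the nearest digit), and $t[\nu, 0.975]$ are found to be

$$\bar{y}_{...} = 26.242,$$

$$
\begin{aligned}
MS = {} & \left(\frac{3 \times 4}{10 \times 12}\right)(0.188) - \left(1 - \frac{3}{10}\right)\left(1 - \frac{4}{12}\right)(0.271) \\
 & + \left(1 - \frac{4}{12}\right)(3.445) + \left(1 - \frac{3}{10}\right)(2.312) = 3.807,
\end{aligned}
$$

$$
\nu = \frac{(3.807)^2}{\dfrac{\left(\frac{3 \times 4}{10 \times 12}\right)^2 (0.188)^2}{24} + \dfrac{\left(1 - \frac{3}{10}\right)^2 \left(1 - \frac{4}{12}\right)^2 (0.271)^2}{6} + \dfrac{\left(1 - \frac{4}{12}\right)^2 (3.445)^2}{3} + \dfrac{\left(1 - \frac{3}{10}\right)^2 (2.312)^2}{2}} \doteq 5,
$$

and

$$t[5, 0.975] = 2.571.$$

Substituting these values in the preceding formula, the desired 95 percent confidence limits for $\mu$ are determined as

$$26.242 \pm 2.571 \sqrt{\frac{3.807}{36}} = (25.406, 27.078).$$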
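The same calculation can be laid out as a short Python sketch. The weights below are read off the numerical expression for MS given above; the t quantile 2.571 is taken from the text rather than computed, and all variable names are illustrative.

```python
import math

a, b, n = 3, 4, 3
A, B = 10, 12
ms = {"E": 0.188, "AB": 0.271, "B": 3.445, "A": 2.312}
df = {"E": 24, "AB": 6, "B": 3, "A": 2}

# Weights of the mean squares in the linear combination MS.
weights = {
    "E": (a * b) / (A * B),
    "AB": -(1 - a / A) * (1 - b / B),
    "B": 1 - b / B,
    "A": 1 - a / A,
}

ms_combined = sum(weights[k] * ms[k] for k in weights)
# Satterthwaite-type approximation to the degrees of freedom of MS.
nu = ms_combined ** 2 / sum((weights[k] * ms[k]) ** 2 / df[k] for k in weights)

y_bar = 26.242
t_quantile = 2.571                      # t[5, 0.975], as quoted in the text
half_width = t_quantile * math.sqrt(ms_combined / (a * b * n))
lower, upper = y_bar - half_width, y_bar + half_width

print(f"MS = {ms_combined:.3f}, nu = {nu:.2f}")
print(f"95% limits for mu: ({lower:.3f}, {upper:.3f})")
```

Note that the machine and operator mean squares dominate both MS and the degrees of freedom, so the interval is governed almost entirely by the 2 and 3 degrees of freedom of the main effects.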


9.8 OTHER MODELS 


Throughout this volume, we have been concerned mainly with such terms as 
Models I, II, and III, or fixed, random, and mixed models, depending on the 
nature of the factors in the experiment. For Model I, it was assumed that for all 
factors the levels employed in the experiment make up the population of levels 
of interest. When the levels used in the experiment constitute a sample from an 
infinite population of levels, Model II is appropriate. A case involving at least 
one factor fixed and others random was termed as Model III. 

In the preceding sections of this chapter, we have considered the so-called finite population models, in which the error terms are assumed to be random variables from an infinite population; but the levels of the factors are assumed to be random samples from a finite population of levels, and use is made of the fact that the variance of the mean of a random sample of n from a finite population of size N with variance $\sigma^2$ is given by $\left(1 - \frac{n}{N}\right)\frac{\sigma^2}{n}$. The extra factor $\left(1 - \frac{n}{N}\right)$ is known as the finite population correction. If we let f = n/N, the finite population correction is 1 − f. In this way, the tables of the expected values of the mean squares for various crossed and nested classifications were readily obtained.
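The finite population correction can be verified exactly on a small example by enumerating every possible sample. The sketch below uses a hypothetical population of five values and exact rational arithmetic; the population variance is taken with divisor N − 1, which is the convention under which the stated formula holds.

```python
from fractions import Fraction
from itertools import combinations


def var_of_sample_mean(population, n):
    """Exact variance of the sample mean over all equally likely samples of size n
    drawn without replacement from `population`."""
    means = [Fraction(sum(sample), n) for sample in combinations(population, n)]
    centre = sum(means) / len(means)
    return sum((m - centre) ** 2 for m in means) / len(means)


population = [2, 5, 7, 10, 11]          # hypothetical finite population, N = 5
N, n = len(population), 2

mu = Fraction(sum(population), N)
sigma2 = sum((x - mu) ** 2 for x in population) / (N - 1)   # divisor N - 1

exact = var_of_sample_mean(population, n)
formula = (1 - Fraction(n, N)) * sigma2 / n

print(exact, formula)
```

When n = N the correction vanishes and the sample mean has zero variance, which corresponds to a fixed factor; as N grows without bound the correction tends to 1, which corresponds to a random factor.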

Tukey (1949a) emphasized the restrictiveness of these models and proposed 
to extend the range by defining more complex models. These models have 
received very little attention in statistical literature, except in some theoretical 
works. It is not possible to provide any further discussion on this topic here. 
Plackett (1960) presents an excellent review of many of these models. 


9.9 USE OF STATISTICAL COMPUTING PACKAGES 


The use of SAS, SPSS, and BMDP programs for analyzing finite population 
models is the same as described in earlier chapters for crossed, nested, and par- 
tially nested factors. The computations of degrees of freedom, sums of squares, 
and mean squares as obtained earlier also remain valid for the finite population 
models. However, the expected values of mean squares must be provided using 
the results outlined in this chapter. The results on tests of hypotheses, point 
estimates, and confidence intervals can then be obtained using procedures de- 
veloped in this chapter. 


EXERCISES 


1. Consider a two-way crossed finite population model involving three
varieties of wheat and three different fertilizers. The three varieties of
wheat are selected randomly from a finite population of nine varieties
of interest to the experimenter. Similarly, three fertilizers are taken at
random from a finite population of twelve fertilizers available for the
experiment. The data on yields in bushels/acre are given as follows.


                 Fertilizer
Variety      1      2      3

I           60     52     65
            61     50     66
            62     58     68
II          75     60     71
           719     61     72
            77     62     73
III         76     59     74
            77     60     75
            78     61     77


(a) State the mathematical model and the assumptions for the ex- 
periment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Test whether there are significant interaction effects among varieties and fertilizers. Use α = 0.05.

(d) Perform conservative and pseudo-F tests to determine whether
there are differences in yields among the varieties of wheat. Use
α = 0.05.

(e) Perform conservative and pseudo-F tests to determine whether
there are differences in yields among different fertilizers. Use
α = 0.05.

(f) Estimate the variance components of the model.

(g) Determine an exact 95 percent confidence interval for the error
variance component.

(h) Determine approximate 95 percent confidence intervals for other
variance components of the model.

(i) Determine an approximate 95 percent confidence interval for the
general mean μ using the Satterthwaite procedure.


Some Simple 
Experimental Designs 


10.0 PREVIEW 


In the previous chapters we developed techniques suitable for analyzing experi- 
mental data. It is important at this point to consider the manner in which the 
experimental data were collected as this greatly influences the choice of the 
proper technique for data analysis. If an experiment has been properly designed 
or planned, the data will have been collected in the most efficient manner for the 
problem being considered. Experimental design is the sequence of steps initially 
taken to ensure that the data will be obtained in such a way that analysis will lead
immediately to valid statistical inferences. The purpose of statistically designing 
an experiment is to collect the maximum amount of useful information with a 
minimum expenditure of time and resources. It is important to remember that 
the design of the experiment should be as simple as possible consistent with 
the objectives and requirements of the problem. The purpose of this chapter 
is to introduce some basic principles of experimental design and discuss some 
commonly employed experimental designs of general applications. 


10.1 PRINCIPLES OF EXPERIMENTAL DESIGN 


Three basic principles in designing an experiment are: replication, random- 
ization, and control. The application of these principles ensures validity of the 
analysis and increases its sensitivity, and thus they are crucial to any scientific 
experiment. We briefly discuss each of these principles in the following. 


REPLICATION 


The first principle of a designed experiment is replication, which is merely a 
complete repetition of the basic experiment. It refers to running all the treatment 
combinations again, at a later time period, where each treatment is applied to
several experimental units. It provides an estimate of the magnitude of the 
experimental error and also makes tests of significance of effects possible. 


RANDOMIZATION 


The second principle of a designed experiment is that of randomization, which 
helps to ensure against any unintentional bias in the experimental units and/or 


treatment combinations and can form a sound basis for statistical inference. 
Here, an experimental unit is a unit to which a single treatment combination is 
applied in a single replication of the experiment. The term treatment or treat- 
ment combinations means the experimental conditions that are imposed on an 
experimental unit in a particular experiment. If the data are random, it is safe to 
assume that the experimental errors are independently distributed. However, er- 
rors associated with the experimental units that are adjacent in time or space will 
tend to be correlated, thus violating the assumption of independence. Random- 
ization helps to make this correlation as small as possible so that the analysis 
can be carried out as though the assumption of independence were true. Fur- 
thermore, it allows for unbiased estimates and valid tests of significance of the 
effects of treatments. In addition, although many extraneous variables affecting 
the response in a designed experiment do not vary in a completely random manner,
it is reasonable to assume that their cumulative effect varies in a random
manner. The randomization of treatments to experimental units has the effect of 
randomly assigning the error terms (associated with experimental units) to the 
treatments and thus satisfying the assumptions required for the validity of sta- 
tistical inference. The idea was originally introduced by Fisher (1926) and has 
been further elaborated by Greenberg (1951), Kempthorne¹ (1955, 1977), and
Lorenzen (1984). There are a number of randomization methods available for 
assigning treatments to experimental units (see, e.g., Cochran and Cox (1957); 
Cox (1958a)). 


CONTROL 


The third principle of a designed experiment is that of control, which refers 
to the way in which experimental units in a particular design are balanced, 
blocked, and grouped. Balancing means the assignment of the treatment com- 
binations to the experimental units in such a way that a balanced or systematic 
configuration is obtained. Otherwise, it is unbalanced or we simply say that 
there are missing data. Blocking is the assignment of experimental units to 
blocks in such a manner that the units within a particular block are as ho- 
mogeneous as possible. Grouping refers to the placement of homogeneous 
experimental units into different groups to which separate treatments may be 
assigned. Balancing, blocking, and grouping can be achieved in various ways 
and at various stages of the experiment and their choice is indicated by the 
availability of the experimental conditions. The application of control results 
in the reduction of experimental error, which in turn leads to a more sensitive 
analysis. 

Detailed discussions of these and other principles involved in designing an 
experiment can be found in books on experimental design (see, e.g., Cochran 
and Cox (1957); Cox (1958a)). In the succeeding sections we discuss some sim- 
ple experimental designs for general application. Complex designs employed 


¹ Kempthorne (1977) stresses the necessity of randomization for the validity of error assumptions.




in much agricultural and biomedical experimentation are not considered here.
The reader is referred to excellent books by Federer (1955), Cochran and Cox 
(1957), Gill (1978), Fleiss (1986), Hicks (1987), Winer et al. (1991), Hinkel- 
man and Kempthorne (1994), Kirk (1995), Steel et al. (1997), among others, 
for a discussion of designs not included here. Also, complete details of math- 
ematical models and statistical analysis are not given here since they readily 
follow from the same type of statistical models and the principle of partitioning 
of the sum of squares described in full detail in earlier chapters. 


Remark: There is a voluminous literature on experimental design and many excellent 
sources of reference are currently available. Herzberg and Cox (1959) have given bibli- 
ographies on experimental designs. Federer and Balaam (1972) provided an exclusive 
bibliography on designs (the arrangement of treatment in an experiment) and treatment 
designs (the selection of treatments employed in an experiment) for the period prior to 
1968. Federer and Federer (1973) presented a partial bibliography on statistical designs 
for the period 1968 through 1971. Federer (1980, 1981a,b) in a three-part article gave a 
bibliography on experimental designs from 1972 through 1978. For recent developments 
in design of experiments, covering the literature of 1975 through 1980, see Atkinson
(1982). For an annotated bibliography of the books on design of experiments see Hahn 
(1982). 


10.2 COMPLETELY RANDOMIZED DESIGN


In a completely randomized design, the treatments are allocated entirely by 
chance. In other words, all experimental units are considered the same and no 
division or grouping among them exists. The design is entirely flexible in that 
any number of treatments or replications may be used. The replications may 
vary from treatment to treatment and all available experimental material can 
be utilized. Other advantages of this design include the simplicity of
the statistical analysis even for the case of missing data. The relative loss of 
information due to missing data is less for the completely randomized design 
than for any other design. 

In a completely randomized design all the variability among the experimen- 
tal units goes into the experimental error. The completely randomized design 
should be used when the experimental material is homogeneous or missing val- 
ues are expected to occur. The design is also appropriate in small experiments 
when an increase in accuracy from other designs does not outweigh the loss 
of degrees of freedom due to experimental error. The main disadvantage to the 
completely randomized design is that it is often inefficient. 
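The allocation of treatments entirely by chance can be sketched as follows. This is one simple way to carry out the randomization, assuming Python's standard pseudo-random generator is acceptable for the purpose; the function name and the `seed` argument (used only to make a layout reproducible) are hypothetical.

```python
import random
from collections import Counter


def completely_randomized(treatments, replications, seed=None):
    """Assign treatments to experimental units completely at random.

    treatments   : sequence of treatment labels
    replications : number of units for each treatment (may differ by treatment)
    Returns a list whose u-th entry is the treatment given to unit u.
    """
    rng = random.Random(seed)
    labels = [t for t, r in zip(treatments, replications) for _ in range(r)]
    rng.shuffle(labels)          # a single randomization over all units
    return labels


# Five treatments in quadruplicate on 20 units.
layout = completely_randomized("ABCDE", [4, 4, 4, 4, 4], seed=1)
print(layout)
print(Counter(layout))
```

Unequal replication is handled by the same call, for example `completely_randomized("ABC", [5, 3, 2])`, which reflects the flexibility of the design noted above.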


MODEL AND ANALYSIS 


If we take $n_i$ replications for each treatment or treatment combination in a completely random manner, then the analysis of variance model for the experiment is given by

$$
y_{ij} = \mu + \tau_i + e_{ij}
\qquad
\begin{cases}
i = 1, 2, \ldots, a \\
j = 1, 2, \ldots, n_i,
\end{cases}
\tag{10.2.1}
$$

where $y_{ij}$ is the j-th observation corresponding to the i-th treatment, $-\infty < \mu < \infty$ is the general mean, $\tau_i$ is the effect due to the i-th treatment, and $e_{ij}$ is the error associated with the i-th treatment and the j-th observation. As before, the assumptions inherent in the model are linearity, normality, additivity, independence, and homogeneity of variances. Clearly, model (10.2.1) is the same as the one-way classification model (2.1.1) and the appropriate analysis will be the one-way analysis of variance as described in Chapter 2.

There are two models associated with the model equation (10.2.1). Model I is concerned with only the treatments present in the experiment and under Model II treatments are assumed to be a random sample from an infinite population of treatments. Model I requires that $\sum_{i=1}^{a} n_i \tau_i = 0$ and under Model II the $\tau_i$'s are assumed to be normal random variables with mean zero and variance $\sigma_{\tau}^2$. The steps in the analysis of this model are identical to those discussed in Chapter 2. The complete analysis of variance for the balanced case (i.e., when $n_1 = n_2 = \cdots = n_a = n$) is shown in Table 10.1 and that for the unbalanced case in Table 10.2.

TABLE 10.1
Analysis of Variance for the Completely Randomized Design with Equal Sample Sizes

                                                         Expected Mean Square
Source of     Degrees of    Sums of    Mean
Variation     Freedom       Squares    Square     Model I                                                  Model II                           F Value

Treatment     a − 1         SS_τ       MS_τ       $\sigma_e^2 + \frac{n}{a-1}\sum_{i=1}^{a}\tau_i^2$       $\sigma_e^2 + n\sigma_{\tau}^2$    MS_τ/MS_E
Error         a(n − 1)      SS_E       MS_E       $\sigma_e^2$                                             $\sigma_e^2$
Total         an − 1        SS_T

The hypothesis of interest under fixed effects or Model I is

$$
\begin{aligned}
& H_0: \tau_1 = \tau_2 = \cdots = \tau_a = 0 \\
& \text{versus} \\
& H_1: \text{at least one } \tau_i \neq 0.
\end{aligned}
\tag{10.2.2}
$$

In Model II, we are still interested in the hypothesis of no treatment effects; however, the $\tau_i$'s are random variables with mean zero and variance $\sigma_{\tau}^2$. In this case the hypothesis of no treatment effects is

$$H_0: \sigma_{\tau}^2 = 0 \quad \text{versus} \quad H_1: \sigma_{\tau}^2 > 0. \tag{10.2.3}$$


TABLE 10.2
Analysis of Variance for the Completely Randomized Design with Unequal Sample Sizes

                                                                     Expected Mean Square
Source of     Degrees of                 Sums of    Mean
Variation     Freedom                    Squares    Square     Model I                                                      Model II*                            F Value

Treatment     a − 1                      SS_τ       MS_τ       $\sigma_e^2 + \frac{1}{a-1}\sum_{i=1}^{a} n_i\tau_i^2$       $\sigma_e^2 + n_0\sigma_{\tau}^2$    MS_τ/MS_E
Error         $\sum_{i=1}^{a} n_i - a$   SS_E       MS_E       $\sigma_e^2$                                                 $\sigma_e^2$
Total         $\sum_{i=1}^{a} n_i - 1$   SS_T

* $n_0 = \frac{1}{a-1}\left(N - \frac{1}{N}\sum_{i=1}^{a} n_i^2\right)$, where $N = \sum_{i=1}^{a} n_i$.

The statistic F = MS_τ/MS_E, which has an F distribution with a − 1 and a(n − 1) (or $\sum_{i=1}^{a} n_i - a$, for the unbalanced case) degrees of freedom, is used to test the hypothesis (10.2.2) or (10.2.3). A more general hypothesis on $\sigma_{\tau}^2$ may be of the form

$$H_0': \sigma_{\tau}^2/\sigma_e^2 \leq \rho_0 \quad \text{versus} \quad H_1': \sigma_{\tau}^2/\sigma_e^2 > \rho_0,$$

where $\rho_0$ is a specified value of $\rho = \sigma_{\tau}^2/\sigma_e^2$. As in (2.7.11), this hypothesis is tested by the statistic $(1 + n\rho_0)^{-1}(MS_{\tau}/MS_E)$, which has an F distribution with a − 1 and a(n − 1) degrees of freedom.

For the estimation of the variance components $\sigma_e^2$ and $\sigma_{\tau}^2$, which are of interest under Model II, we can, as before, employ the analysis of variance procedure. The estimators thus obtained are given by

$$
\hat{\sigma}_e^2 = MS_E
\quad \text{and} \quad
\hat{\sigma}_{\tau}^2 = \frac{1}{n}(MS_{\tau} - MS_E).
\tag{10.2.4}
$$


For all other details of the analysis of the model (10.2.1), refer to Chapter 2. 


WORKED EXAMPLE


Fisher (1958, p. 262) reported data on the weights of mangold roots collected by 
Mercer and Hall in a uniformity trial with 20 strips of land using a completely 
randomized design to test five different treatments each in quadruplicate. The 
data are given in Table 10.3. 




TABLE 10.3 
Data on Weights of Mangold Roots in 
a Uniformity Trial 


               Treatment
   A       B       C       D       E

 3376    3504    3430    3404    3253
 3361    3416    3334    3210    3314
 3366    3244    3291    3168    3287
 3330    3195    3029    3118    3085

Source: Fisher (1958, p. 262). Used with permission.


TABLE 10.4
Analysis of Variance for the Data on Weights of Mangold Roots

Source of     Degrees of    Sums of         Mean
Variation     Freedom       Squares         Square        F value    p-value

Treatment      4             58,725.500     14,681.375    0.95       0.461
Error         15            231,040.250     15,402.683
Total         19            289,765.750


The analysis of variance calculations are readily performed and the results are 
summarized in Table 10.4. The outputs illustrating the applications of statistical 
packages to perform the analysis of variance are presented in Figure 10.1. Here, 
the ratio of mean squares is not significant (p = 0.461) and the conclusion 
would be that there are no significant differences among the treatments. 
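For readers without access to the packages, the entries of Table 10.4 other than the p-value can be recomputed with a few lines of plain Python. This is only a sketch of the one-way analysis of variance arithmetic applied to the data of Table 10.3; the variable names are illustrative.

```python
# Weights of mangold roots by treatment (Table 10.3).
weights = {
    "A": [3376, 3361, 3366, 3330],
    "B": [3504, 3416, 3244, 3195],
    "C": [3430, 3334, 3291, 3029],
    "D": [3404, 3210, 3168, 3118],
    "E": [3253, 3314, 3287, 3085],
}

all_values = [y for group in weights.values() for y in group]
grand_mean = sum(all_values) / len(all_values)
group_means = {t: sum(ys) / len(ys) for t, ys in weights.items()}

# Between-treatment and within-treatment (error) sums of squares.
ss_treatment = sum(len(ys) * (group_means[t] - grand_mean) ** 2 for t, ys in weights.items())
ss_error = sum((y - group_means[t]) ** 2 for t, ys in weights.items() for y in ys)

df_treatment = len(weights) - 1
df_error = len(all_values) - len(weights)

ms_treatment = ss_treatment / df_treatment
ms_error = ss_error / df_error
f_value = ms_treatment / ms_error

print(f"Treatment: df={df_treatment}, SS={ss_treatment:.3f}, MS={ms_treatment:.3f}")
print(f"Error:     df={df_error}, SS={ss_error:.3f}, MS={ms_error:.3f}")
print(f"F = {f_value:.2f}")
```

An F ratio below 1 means the treatment mean square is smaller than the error mean square, so no table lookup is needed to see that the treatments do not differ significantly.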


10.3 RANDOMIZED BLOCK DESIGN 


If the experimental units are divided into a number of groups and a complete 
replication of all treatments is allocated to each group, we have the so-called ran- 
domized complete block design. The randomized block design was developed 
by Fisher (1926). The randomization is carried out separately in each group of 
experimental units, which is usually designated as a block. Here, an attempt is 
made to contain the major variations between blocks so that the experimental 
error in each group is relatively small. Thus, the blocks may be constructed 
so as to coincide with the degree of variability in experimental material. For 
example, in agricultural experimentation, each observation of, say, yield, comes 
from a plot of land, and we may group adjacent plots that are relatively homo- 
geneous to form a block. In executing the experiment, we randomly allocate 
the treatments to the plots in the first block and then repeat the randomization 
for the second and other remaining blocks. 


Program instructions:

    DATA MANGOLD;
    INPUT TRTMENT $ WEIGHT;
    DATALINES;
    A 3376
    A 3361
    . . .
    E 3085
    ;
    PROC ANOVA;
    CLASSES TRTMENT;
    MODEL WEIGHT=TRTMENT;
    RUN;

Output:

    The SAS System
    Analysis of Variance Procedure

    CLASS      LEVELS    VALUES
    TRTMENT    5         A B C D E
    NUMBER OF OBS. IN DATA SET=20

    Dependent Variable: WEIGHT
                                Sum of            Mean
    Source             DF       Squares           Square         F Value    Pr > F
    Model               4        58725.5000       14681.3750     0.95       0.4610
    Error              15       231040.2500       15402.6833
    Corrected Total    19       289765.7500

    R-Square     C.V.         Root MSE      WEIGHT Mean
    0.202665     3.7771452    124.10755     3285.7500000

    Source     DF    Anova SS     Mean Square    F Value    Pr > F
    TRTMENT     4    58725.50     14681.37       0.95       0.4610

(i) SAS application: SAS ANOVA instructions and output for the completely randomized design.

Program instructions:

    DATA LIST
     /TRTMENT 1
      WEIGHT 3-6
    BEGIN DATA.
    1 3376
    1 3361
    1 3366
    1 3330
    . . .
    5 3085
    END DATA.
    ONEWAY WEIGHT BY
     TRTMENT (1,5)
     /STATISTICS=ALL.

Output:

    Test of Homogeneity of Variances
              Levene
              Statistic    df1    df2    Sig.
    WEIGHT    1.940        4      15     .156

    ANOVA
                              Sum of Squares    df    Mean Square    F       Sig.
    WEIGHT  Between Groups     58725.500         4    14681.375      .953    .461
            Within Groups     231040.250        15    15402.683
            Total             289765.750        19

(ii) SPSS application: SPSS ONEWAY instructions and output for the completely randomized design.

Program instructions:

    /INPUT    FILE='C:\SAHAI\TEXTO\EJE23.TXT'.
              FORMAT=FREE.
              VARIABLES=2.
    /VARIABLE NAMES=TRT,WEIGHT.
    /GROUP    CODES(TRT)=1,2,3,4,5.
              NAMES(TRT)=A,B,C,D,E.
    /HISTOGRAM GROUPING=TRT.
              VARIABLES=WEIGHT.
    /END
    1 3376
    1 3361
    . . .
    5 3085

Output:

    BMDP7D - ONE- AND TWO-WAY ANALYSIS OF VARIANCE WITH DATA SCREENING
    Release: 7.0 (BMDP/DYNAMIC)

    ANALYSIS OF VARIANCE TABLE FOR MEANS
    SOURCE     SUM OF SQUARES    DF    MEAN SQUARE    F VALUE    PROB.
    TRTMENT     58725.5000        4    14681.3750     0.95
    ERROR      231040.2500       15    15402.6833

    EQUALITY OF MEANS TESTS; VARIANCES ARE NOT ASSUMED TO BE EQUAL
    WELCH
    BROWN-FORSYTHE

(iii) BMDP application: BMDP 7D instructions and output for the completely randomized design.

FIGURE 10.1 Program Instructions and Output for the Completely Randomized Design: Data on Weights of Mangold Roots in a Uniformity Trial (Table 10.3).




To illustrate the layout of a randomized block design, let us consider eight 
treatments, say, T_1, T_2, ..., T_8, corresponding to eight levels of a factor to be
included in each of five blocks. Figure 10.2 shows such an experimental layout. 
Note that the treatments are randomly allocated within each block. It is evident 
that this layout is quite different from the completely randomized experiment 
where there will be a single randomization of eight treatments repeated five 
times to 40 plots. 


FIGURE 10.2 A Layout of a Randomized Block Design. 
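A layout of this kind can be generated by repeating an independent randomization within each block. The following Python sketch illustrates the idea for eight treatments in five blocks; it is not the layout of Figure 10.2, and the function name and `seed` argument are hypothetical conveniences.

```python
import random


def randomized_block_layout(treatments, n_blocks, seed=None):
    """Return one random ordering of all treatments for each block."""
    rng = random.Random(seed)
    layout = []
    for _ in range(n_blocks):
        order = list(treatments)
        rng.shuffle(order)       # a fresh, independent randomization per block
        layout.append(order)
    return layout


treatments = [f"T{k}" for k in range(1, 9)]
layout = randomized_block_layout(treatments, n_blocks=5, seed=0)

for block_number, order in enumerate(layout, start=1):
    print(f"Block {block_number}: " + "  ".join(order))
```

Each block contains every treatment exactly once, in contrast to a completely randomized design, where a single shuffle over all 40 plots could place several replicates of the same treatment in neighbouring plots.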


MODEL AND ANALYSIS 


The analysis of variance model for a randomized complete block design with 
one observation per experimental unit is given by 


$$
y_{ij} = \mu + \beta_i + \tau_j + e_{ij}
\qquad
\begin{cases}
i = 1, 2, \ldots, b \\
j = 1, 2, \ldots, t,
\end{cases}
\tag{10.3.1}
$$

where $y_{ij}$ denotes the observed value corresponding to the i-th block and the j-th treatment; $-\infty < \mu < \infty$ is the general mean, $\beta_i$ is the effect of the i-th block, $\tau_j$ is the effect of the j-th treatment, and $e_{ij}$ is the customary error term. Clearly, the model (10.3.1) is the same as the model equation (3.1.1) for the two-way crossed classification with one observation per cell. Thus, the analysis is identical to that discussed in Chapter 3 with the only difference that the factor A now designates "blocks" and the factor B denotes "treatments." There are three versions of model (10.3.1) (i.e., Models I, II, and III) depending on whether blocks or treatments or both are chosen at random.


Both Blocks and Treatments Fixed 
In a randomized block experiment, both blocks and treatments may be fixed.
In this case, the $\beta_i$'s and $\tau_j$'s are fixed constants with the
restrictions that

$$
\sum_{i=1}^{b} \beta_i = 0 = \sum_{j=1}^{t} \tau_j,
$$

and the $e_{ij}$'s are normal random variables with mean zero and variance
$\sigma_e^2$. The analysis of variance in Table 3.2 can now be rewritten in the
notation of model (10.3.1) as shown in Table 10.5.


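TABLE 10.5
Analysis of Variance for the Randomized Block Design

Source of    Degrees of        Sum of         Mean
Variation    Freedom           Squares        Square         F Value
Block        b - 1             $SS_B$         $MS_B$         $MS_B/MS_E$
Treatment    t - 1             $SS_\tau$      $MS_\tau$      $MS_\tau/MS_E$
Error        (b - 1)(t - 1)    $SS_E$         $MS_E$
Total        bt - 1            $SS_T$

Expected Mean Square

Source       Model I                                                 Model II                          Model III
Block        $\sigma_e^2 + \frac{t}{b-1}\sum_{i=1}^{b}\beta_i^2$     $\sigma_e^2 + t\sigma_\beta^2$    $\sigma_e^2 + t\sigma_\beta^2$
Treatment    $\sigma_e^2 + \frac{b}{t-1}\sum_{j=1}^{t}\tau_j^2$      $\sigma_e^2 + b\sigma_\tau^2$     $\sigma_e^2 + \frac{b}{t-1}\sum_{j=1}^{t}\tau_j^2$
Error        $\sigma_e^2$                                            $\sigma_e^2$                      $\sigma_e^2$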




Here, the hypothesis

$$
H_0^{\tau}: \tau_1 = \tau_2 = \cdots = \tau_t = 0
\quad \text{versus} \quad
H_1^{\tau}: \tau_j \neq 0 \ \text{for at least one } j, \ j = 1, 2, \ldots, t,
\tag{10.3.2}
$$

is of primary interest and is tested by the statistic

$$
F_\tau = \frac{MS_\tau}{MS_E}.
$$

If $F_\tau > F[t - 1, (b - 1)(t - 1); 1 - \alpha]$, then $H_0^{\tau}$ will be
rejected and the conclusion is that there are significant differences among the
treatments.
The hypothesis

$$
H_0^{\beta}: \beta_1 = \beta_2 = \cdots = \beta_b = 0
\quad \text{versus} \quad
H_1^{\beta}: \beta_i \neq 0 \ \text{for at least one } i, \ i = 1, 2, \ldots, b,
\tag{10.3.3}
$$

although of minor importance, may be tested in a similar manner by the statistic

$$
F_\beta = \frac{MS_B}{MS_E}.
$$

However, due to the manner in which the experiment is set up, the hypothesis
(10.3.3) should not be tested except as a check on the blocking of the experiment.
The whole purpose of a randomized block design is to reduce experimental
error and get a more efficient test of (10.3.2). Therefore, if the statistic
$F_\beta$ is nonsignificant, there is strong evidence of improperly carried out
blocking. In that case, the entire experiment should be repeated with more careful
attention to the assignment of the treatments to the experimental units.²


Both Blocks and Treatments Random 

In a randomized block experiment, both blocks and treatments may be randomly
chosen, and then we will have a Model II or a random model. Here, all
$\beta_i$'s, $\tau_j$'s, and $e_{ij}$'s are mutually and completely independent
normal random variables with mean zero and variances $\sigma_\beta^2$,
$\sigma_\tau^2$, and $\sigma_e^2$, respectively. The analysis of variance in this
case is also given by Table 10.5. The hypotheses on $\sigma_\beta^2$ and
$\sigma_\tau^2$ can be tested by the same statistics as in the case when both
blocks and treatments are fixed.


Blocks Random and Treatments Fixed

In a randomized block experiment, the blocks may be chosen randomly from a
population of blocks, but the treatments may be fixed. In this case, we will have
a Model III or mixed model, where the $\tau_j$'s are fixed constants with the
restriction that

$$
\sum_{j=1}^{t} \tau_j = 0,
$$

and the $\beta_i$'s and $e_{ij}$'s are mutually and completely independent normal
random variables with mean zero and variances $\sigma_\beta^2$ and $\sigma_e^2$,
respectively. The analysis of variance in this case is again given by Table 10.5.
The hypotheses about the $\beta_i$'s and $\tau_j$'s can be tested in the same way
as in the case when both blocks and treatments are fixed.


2 For further discussions of this issue, see Lentner et al. (1989) and Samuels et al. (1991).


Blocks Fixed and Treatments Random 

In a randomized block experiment, the treatments may be chosen at random
from a population of treatments, but the blocks may be fixed. In this case, we
again have a mixed model situation. The assumptions and tests of hypotheses
are as given in the preceding case with the roles of the $\beta_i$'s and
$\tau_j$'s being reversed.


Remark: It should be noticed that if block effects had been ignored in the analysis, the
analysis of variance would be the same as shown in Table 10.5, except that now the block
and residual sums of squares would be pooled, giving an error sum of squares equal to
$SS_B + SS_E$ with $(b - 1) + (b - 1)(t - 1) = t(b - 1)$ degrees of freedom. Thus, the
test for the hypothesis (10.3.2) would be inefficient since all the variation between
blocks has been lumped with the experimental error. Furthermore, note that the analysis
of variance model (10.3.1) for a randomized complete block design looks identical to the
two-way crossed classification model (3.2.1) with one observation per cell. However, the
assignment of experimental units to treatments in these two layouts is quite different.
In a randomized block design, the t treatments are randomized within a block, whereas in
a two-way crossed model, $a \times b$ treatment combinations are completely randomized
to $a \times b$ experimental units. Thus, the interpretation of the two models is quite
different. The randomized block design of course can be extended to problems involving
two or more factors.


MISSING OBSERVATIONS 

The problem of missing data is treated similarly to that discussed in Section 3.10 
for the two-way classification with one observation per cell. 

RELATIVE EFFICIENCY OF THE DESIGN 

The relative efficiency (RE) of an experimental design in comparison to any
other design can be evaluated in terms of the variance of the treatment.³ In a
randomized block design (RBD) with b blocks and t treatments, let $MS_B$ and
$MS_E$ be the block and error mean squares, respectively. If a completely
randomized design (CRD) were used with the same number $b \times t$ of experimental
units as the RBD, then an estimate of the error variance would be obtained as

$$
\frac{(b - 1)MS_B + b(t - 1)MS_E}{bt - 1}.
$$

However, the error mean square of the RBD with the same number of experimental
units is actually $MS_E$. Hence, the RE of the RBD compared to CRD is given by

$$
RE = \frac{(b - 1)MS_B + b(t - 1)MS_E}{(bt - 1)MS_E}.
$$


3 In general, the relative efficiency is defined as the ratio of two variances. Thus, given two
estimators $T_1$ and $T_2$ of the same parameter, the relative efficiency of $T_1$ compared to
$T_2$ is defined as $\mathrm{Var}(T_2)/\mathrm{Var}(T_1)$. The preceding derivation depends
essentially upon this type of comparison of variances. For another approach to relative
efficiency of a design, see Cochran (1937).


TABLE 10.6
Data on Yields of Wheat Straw from a Randomized Block Experiment

                  Treatment
Block      1      2      3      4
1        332    412    542    730
2        260    384    472    590
3        202    362    516    294
4        210    348    458    560

Source: Anderson (1946). Used with permission.
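As a rough illustration of the relative-efficiency formula in this subsection, the following Python sketch evaluates RE for given mean squares. The function name `relative_efficiency_rbd` is hypothetical, and the numerical inputs are the block and error mean squares reported for the wheat straw data in Table 10.7.

```python
def relative_efficiency_rbd(ms_block, ms_error, b, t):
    """RE of an RBD relative to a CRD using the same b*t experimental units."""
    # Estimated CRD error variance: block and error variation pooled.
    crd_error = ((b - 1) * ms_block + b * (t - 1) * ms_error) / (b * t - 1)
    # Ratio of the CRD error variance estimate to the RBD error mean square.
    return crd_error / ms_error


# Mean squares taken from Table 10.7 (b = 4 blocks, t = 4 treatments).
re_wheat = relative_efficiency_rbd(ms_block=18120.667, ms_error=6966.667, b=4, t=4)
print(round(re_wheat, 3))
```

A value above 1 suggests that blocking reduced the error variance relative to what a completely randomized layout would be expected to give on the same units.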


REPLICATIONS 


In using a randomized block design (RBD), it is sometimes desirable to repli- 
cate each block-treatment combination on 7 experimental units. Such a design 
is commonly known as a generalized randomized block design (GRBD). The
principal advantage of the GRBD over the RBD lies in the fact that it allows the
estimation of interaction effects between blocks and treatments. The analysis
of this design proceeds in exactly the same manner as for the two-way crossed
classification with interactions discussed in Chapter 4 with the only difference
that the factor A now designates “blocks” and the factor B denotes “treatments.”


WORKED EXAMPLE


Anderson (1946) reported data on the yields of wheat straw from an experiment
using a randomized block design with four blocks and four treatments. A portion
of the data is given in Table 10.6.


TABLE 10.7
Analysis of Variance for the Data on Yields of Wheat Straw

Source of    Degrees of    Sum of      Mean
Variation    Freedom       Squares     Square        F Value    p-Value
Block         3             54,362     18,120.667    2.60       0.117
Treatment     3            206,394     68,798.000    9.88       0.003
Error         9             62,700      6,966.667
Total        15            323,456


The analysis of variance calculations are readily performed and the results are 
summarized in Table 10.7. The outputs illustrating the applications of statistical 
packages to perform the analysis of variance are presented in Figure 10.3. Here, 
the ratio of mean squares for treatments is highly significant (p = 0.003) and 
there is very strong evidence of real treatment differences. The block effects 
seem to be insignificant (p = 0.117) and there may be some question regarding 
the effectiveness of the blocking. 
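The sums of squares in Table 10.7 can be checked with a short calculation. The sketch below is a minimal pure-Python version, assuming the data of Table 10.6 are entered with one list per block; the variable names are illustrative and not part of the original program listings.

```python
# Yields of wheat straw (Table 10.6): one row per block, one column per treatment.
yields = [
    [332, 412, 542, 730],
    [260, 384, 472, 590],
    [202, 362, 516, 294],
    [210, 348, 458, 560],
]
b = len(yields)        # number of blocks
t = len(yields[0])     # number of treatments

grand_total = sum(sum(row) for row in yields)
correction = grand_total ** 2 / (b * t)          # correction for the mean

ss_total = sum(y * y for row in yields for y in row) - correction
ss_block = sum(sum(row) ** 2 for row in yields) / t - correction
ss_treatment = sum(sum(col) ** 2 for col in zip(*yields)) / b - correction
ss_error = ss_total - ss_block - ss_treatment

ms_block = ss_block / (b - 1)
ms_treatment = ss_treatment / (t - 1)
ms_error = ss_error / ((b - 1) * (t - 1))

f_block = ms_block / ms_error
f_treatment = ms_treatment / ms_error

print(ss_block, ss_treatment, ss_error, ss_total)
print(round(f_block, 2), round(f_treatment, 2))
```

The p-values in Table 10.7 additionally require the F distribution, which is not computed here.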


10.4 LATIN SQUARE DESIGN 


The randomized block design was used to reduce experimental error by elimi- 
nating a source of variation in experimental units by utilizing the principle 
of blocking. The Latin square design eliminates two extraneous sources of 
variation in experimental units by using two-way or double blocking on the 
experimental units. The rows and columns are then used for two mutually or- 
thogonal systems of blocks and the letters are used for treatments. In agricultural 
experiments, the rows and columns are usually strips of land, with row strips at 
right angles to the column strips, and the plots are the intersection of strips in 
different directions. In this sense we can say that the Latin square is an extension 
of the randomized block design. 

In general, a Latin square for p treatments, or a $p \times p$ Latin square, is
a square matrix with p rows and p columns. Each of the resulting $p^2$ cells
contains one of the p letters. Each letter corresponds to one of the treatments
and each letter occurs once and only once in each row and each column. A
Latin square of any order can be obtained most easily by simply writing the
letters in their natural order in the first column and then completing each row by
other letters cyclically, that is, with symbols again in the same order except that
the last letter is followed by the first. Some of the examples of Latin squares 
are given in Figure 10.4. Appendix X contains some more representations of 
Latin squares from 3 x 3 to 12 x 12. Some more examples are given in Norton 


DATA WHEATSTRAW;
INPUT BLOCK TRTMENT YIELD;
DATALINES;
1 1 332
1 2 412
1 3 542
. . .
4 4 560
;
PROC ANOVA;
CLASSES BLOCK TRTMENT;
MODEL YIELD=BLOCK TRTMENT;
RUN;

The SAS System
Analysis of Variance Procedure

CLASS      LEVELS    VALUES
BLOCK      4         1 2 3 4
TRTMENT    4         1 2 3 4
NUMBER OF OBS. IN DATA SET=16

Dependent Variable: YIELD
                           Sum of          Mean
Source             DF      Squares         Square        F Value    Pr > F
Model               6      260756.0000     43459.3333    6.24       0.0079
Error               9       62700.0000      6966.6666
Corrected Total    15      323456.0000

R-Square    C.V.         Root MSE     YIELD Mean
0.806156    20.015962    83.466560    417.0000

Source     DF    Anova SS       Mean Square    F Value    Pr > F
BLOCK       3     54362.0000    18120.6667     2.60       0.1165
TRTMENT     3    206394.0000    68798.0000     9.88       0.0033

(i) SAS application: SAS ANOVA instructions and output for the randomized block
design.


DATA LIST
  /BLOCK 1 TRTMENT 3
   YIELD 5-7.
MANOVA YIELD BY
  BLOCK (1,4)
  TRTMENT (1,4)
  /DESIGN=BLOCK
   TRTMENT.

Analysis of Variance--Design 1
Tests of Significance for YIELD using UNIQUE sums of squares

Source of Variation    SS         DF    MS    F    Sig of F
RESIDUAL                62700.     9
BLOCK                   54362.     3               .117
TRTMENT                206394.     3               .003
(Model)                260756.                     .008
(Total)                323456.

R-Squared = .806
Adjusted R-Squared = .677

(ii) SPSS application: SPSS MANOVA instructions and output for the randomized block
design.


/INPUT    FILE='C:\SAHAI\TEXTO\EJE24.TXT'.
          FORMAT=FREE.
          VARIABLES=3.
/VARIABLE NAMES=BL,TRE,YIELD.
/GROUP    VARIABLE=BL,TRE.
          CODES(BL)=1,2,3,4.
          NAMES(BL)=B1,B2,B3,B4.
          CODES(TRE)=1,2,3,4.
          NAMES(TRE)=T1,T2,T3,T4.
/DESIGN   DEPENDENT=YIELD.
/END
1 1 332
. . .
4 4 560

BMDP2V - ANALYSIS OF VARIANCE AND COVARIANCE WITH REPEATED MEASURES
Release: 7.0 (BMDP/DYNAMIC)

ANALYSIS OF VARIANCE FOR THE 1-ST DEPENDENT VARIABLE
THE TRIALS ARE REPRESENTED BY THE VARIABLES: YIELD
THE HIGHEST ORDER INTERACTION IN EACH TABLE HAS BEEN REMOVED FROM THE MODEL
SINCE THERE IS ONE SUBJECT PER CELL

SOURCE    SUM OF SQUARES    D.F.    MEAN SQUARE       F         TAIL PROB.
MEAN      2782224.00000     1       2782224.00000     399.36    0.0000
BLOCK       54362.00000     3         18120.66667       2.60    0.1165
TREATM     206394.00000     3         68798.00000       9.88    0.0033
ERROR       62700.00000     9          6966.66667

(iii) BMDP application: BMDP 2V instructions and output for the randomized block
design.


FIGURE 10.3 Program Instructions and Output for the Randomized Block
Design: Data on Yields of Wheat Straw from a Randomized Block Design
(Table 10.6).




(1939), Cochran and Cox (1957, pp. 145-146), and Fisher and Yates (1963, 
pp. 86-89). 


3 x 3      4 x 4       5 x 5        6 x 6
A B C      A B C D     A B C D E    A B C D E F
B C A      B C D A     B C D E A    B C D E F A
C A B      C D A B     C D E A B    C D E F A B
           D A B C     D E A B C    D E F A B C
                       E A B C D    E F A B C D
                                    F A B C D E


FIGURE 10.4 Some Selected Latin Squares. 
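The cyclic construction described above, and the defining property that each letter occurs exactly once in every row and column, can be sketched in a few lines of Python. The function names are illustrative, and the sketch assumes at most 26 treatments so that single capital letters suffice.

```python
def cyclic_latin_square(p):
    """Return the p x p cyclic Latin square as a list of rows of letters."""
    letters = [chr(ord("A") + i) for i in range(p)]   # assumes p <= 26
    # Row i is the natural order shifted cyclically by i positions.
    return [[letters[(i + j) % p] for j in range(p)] for i in range(p)]


def is_latin_square(square):
    """Check that every row and column contains each symbol exactly once."""
    p = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == p and set(row) == symbols for row in square)
    cols_ok = all(set(col) == symbols for col in zip(*square))
    return len(symbols) == p and rows_ok and cols_ok


for row in cyclic_latin_square(4):
    print(" ".join(row))
```

For p = 4 this prints the same 4 x 4 square as in Figure 10.4.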


For a given size p, there are many different p x p Latin squares that can 
be constructed. For example, there are 576 different possible 4 x 4 Latin 
squares, 161,280 different 5 x 5 squares, 812,851,200 different 6 x 6 squares, 
61,479,419,904,000 different 7 x 7 squares, and the number of possible squares
increases vastly as the size of p increases. The smallest Latin square that can 
be used is a 3 x 3 design. Latin squares larger than 9 x 9 are rarely used due 
to the difficulty of finding equal numbers of groups for the rows, columns, 
and treatments. The randomization procedures for Latin squares were initially 
given by Yates (1937a) and are also described by Fisher and Yates (1963). The 
proper randomization scheme consists of selecting at random one of the ap- 
propriate size Latin squares from those available. Randomization can also be 
carried out by randomly permuting first the rows and then the columns, and 
finally randomly assigning the treatments to the letters. 
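The second randomization scheme mentioned above (permute the rows, then the columns, then assign treatments to letters at random) can be sketched as follows. This is a minimal illustration, not the procedure of Yates (1937a): the function name and the seeded generator are assumptions, and permuting a single starting square does not by itself select uniformly from all possible Latin squares of that size.

```python
import random


def randomize_latin_square(square, rng):
    """Permute rows, then columns, then relabel the treatment letters."""
    p = len(square)
    row_order = rng.sample(range(p), p)          # random permutation of rows
    col_order = rng.sample(range(p), p)          # random permutation of columns
    symbols = sorted(set(square[0]))
    relabel = dict(zip(symbols, rng.sample(symbols, p)))  # letters -> treatments
    return [[relabel[square[i][j]] for j in col_order] for i in row_order]


def is_latin_square(square):
    """Each symbol must appear exactly once in every row and column."""
    symbols = set(square[0])
    return (len(symbols) == len(square)
            and all(set(row) == symbols for row in square)
            and all(set(col) == symbols for col in zip(*square)))


standard = ["ABCD", "BCDA", "CDAB", "DABC"]      # cyclic 4 x 4 square
rng = random.Random(2000)                        # seed chosen only for reproducibility
randomized = randomize_latin_square(standard, rng)
for row in randomized:
    print(" ".join(row))
```

Each of the three steps preserves the Latin property, so the result is always a valid layout.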

Latin squares were first employed in agricultural experiments where soil
conditions often vary row-wise as well as column-wise. Treatments were applied
in a field using a Latin square design in order to randomize for any differences
in fertility in different directions of the field. However, the design was soon
found to be useful in many other scientific and industrial experiments. Latin
squares are often used to study the effects of three factors, where the factors
corresponding to the rows and columns are of interest in themselves and not
introduced for the main purpose of reducing experimental error. Note that in a
Latin square there are only $p^2$ experimental units to be used in the experiment
instead of the $p^3$ possible experimental units needed in a complete three-way
layout. Thus, the Latin square design requires only a fraction 1/p of the
observations needed for the complete three-way layout. However, this reduction is
gained at the cost of the assumption of additivity or the absence of interactions
among the factors. Thus, in a Latin square, it is very difficult (often
impossible) to detect interaction between factors. To study interactions, other
layouts such as factorial designs are needed.


MODEL AND ANALYSIS 


The analysis of variance model for a Latin square design is

$$
y_{ijk} = \mu + \alpha_i + \beta_j + \tau_k + e_{ijk}, \qquad
i = 1, 2, \ldots, p; \quad j = 1, 2, \ldots, p; \quad k = 1, 2, \ldots, p,
\tag{10.4.1}
$$

where $y_{ijk}$ denotes the observed value corresponding to the i-th row, the j-th
column, and the k-th treatment; $-\infty < \mu < \infty$ is the overall mean,
$\alpha_i$ is the effect of the i-th row, $\beta_j$ is the effect of the j-th
column, $\tau_k$ is the effect of the k-th treatment, and $e_{ijk}$ is the random
error. The model is completely additive; that is, there are no interactions between
rows, columns, and treatments. Furthermore, since there is only one observation in
each cell, only two of the three subscripts i, j, and k are needed to denote a
particular observation. This is a consequence of each treatment appearing exactly
once in each row and column.

The analysis of variance consists of partitioning the total sum of squares of
the $N = p^2$ observations into components of rows, columns, treatments, and
error by using the identity

$$
y_{ijk} - \bar{y}_{...} = (\bar{y}_{i..} - \bar{y}_{...}) + (\bar{y}_{.j.} - \bar{y}_{...})
+ (\bar{y}_{..k} - \bar{y}_{...})
+ (y_{ijk} - \bar{y}_{i..} - \bar{y}_{.j.} - \bar{y}_{..k} + 2\bar{y}_{...}).
$$

Squaring each side and summing over i, j, k, and noting that (i, j, k) take on
only $p^2$ values, we obtain

$$
SS_T = SS_R + SS_C + SS_\tau + SS_E, \tag{10.4.2}
$$

where

$$
\begin{aligned}
SS_T &= \sum_{i=1}^{p} \sum_{j=1}^{p} \sum_{k=1}^{p} (y_{ijk} - \bar{y}_{...})^2, \\
SS_R &= p \sum_{i=1}^{p} (\bar{y}_{i..} - \bar{y}_{...})^2, \\
SS_C &= p \sum_{j=1}^{p} (\bar{y}_{.j.} - \bar{y}_{...})^2, \\
SS_\tau &= p \sum_{k=1}^{p} (\bar{y}_{..k} - \bar{y}_{...})^2,
\end{aligned}
$$

and

$$
SS_E = \sum_{i=1}^{p} \sum_{j=1}^{p} \sum_{k=1}^{p}
(y_{ijk} - \bar{y}_{i..} - \bar{y}_{.j.} - \bar{y}_{..k} + 2\bar{y}_{...})^2.
$$

The corresponding degrees of freedom are partitioned as

Total        Rows        Columns     Treatments    Error
$p^2 - 1$  = $(p - 1)$ + $(p - 1)$ + $(p - 1)$   + $(p - 1)(p - 2)$

The usual assumptions of the fixed effects model are:

$$
\sum_{i=1}^{p} \alpha_i = \sum_{j=1}^{p} \beta_j = \sum_{k=1}^{p} \tau_k = 0,
$$

and the $e_{ijk}$'s are normal random variables with mean zero and variance
$\sigma_e^2$. Under the assumptions of the random effects model, the $\alpha_i$'s,
$\beta_j$'s, and $\tau_k$'s are also normal random variables with mean zero and
variances $\sigma_\alpha^2$, $\sigma_\beta^2$, and $\sigma_\tau^2$, respectively.
Other assumptions leading to a mixed model can also be made. Now, the expected mean
squares are readily derived and the complete analysis of variance is shown in
Table 10.8. Furthermore, it can be shown that under the fixed effects model, each
sum of squares on the right-hand side of (10.4.2) divided by $\sigma_e^2$ is an
independently distributed chi-square random variable.


TABLE 10.8
Analysis of Variance for the Latin Square Design

Source of    Degrees of        Sum of         Mean
Variation    Freedom           Squares        Square         F Value
Row          p - 1             $SS_R$         $MS_R$         $MS_R/MS_E$
Column       p - 1             $SS_C$         $MS_C$         $MS_C/MS_E$
Treatment    p - 1             $SS_\tau$      $MS_\tau$      $MS_\tau/MS_E$
Error        (p - 1)(p - 2)    $SS_E$         $MS_E$
Total        $p^2 - 1$         $SS_T$

Expected Mean Square*

Source       Model I                                                  Model II
Row          $\sigma_e^2 + \frac{p}{p-1}\sum_{i=1}^{p}\alpha_i^2$     $\sigma_e^2 + p\sigma_\alpha^2$
Column       $\sigma_e^2 + \frac{p}{p-1}\sum_{j=1}^{p}\beta_j^2$      $\sigma_e^2 + p\sigma_\beta^2$
Treatment    $\sigma_e^2 + \frac{p}{p-1}\sum_{k=1}^{p}\tau_k^2$       $\sigma_e^2 + p\sigma_\tau^2$
Error        $\sigma_e^2$                                             $\sigma_e^2$

* The expected mean squares for the mixed model are not shown, but they can be obtained by
replacing the appropriate term by the corresponding term as one changes from fixed to random
effect; for example, replacing $\sum_{i=1}^{p} \alpha_i^2/(p - 1)$ by $\sigma_\alpha^2$.




Under Model I, the appropriate statistic for testing the hypothesis of no
treatment effects, that is,

$$
H_0^{\tau}: \tau_1 = \tau_2 = \cdots = \tau_p = 0
\quad \text{versus} \quad
H_1^{\tau}: \tau_k \neq 0 \ \text{for at least one } k, \ k = 1, 2, \ldots, p,
$$

is

$$
F_\tau = MS_\tau / MS_E,
$$

which is distributed as $F[p - 1, (p - 1)(p - 2)]$ under the null hypothesis and
as $F'[p - 1, (p - 1)(p - 2); \lambda]$ under the alternative, where

$$
\lambda = \frac{p}{2\sigma_e^2 (p - 1)} \sum_{k=1}^{p} \tau_k^2.
$$

The only hypothesis generally of interest in a Latin square design is the one
concerning the equality of treatments under Model I as given previously. However,
one may also test for no row effects and no column effects by forming the ratio
$MS_R/MS_E$ or $MS_C/MS_E$. Since the rows and columns represent restrictions on
randomization, these tests may not be appropriate. If rows and columns represent
factors and any real interactions are present, they will inflate $MS_E$ and will
make the tests less sensitive. If p < 4, the design is considered to be inadequate
for providing sufficient degrees of freedom for estimating experimental error.


POINT AND INTERVAL ESTIMATION 


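Estimates of various parameters of interest in a Latin square design are readily
obtained along with their sampling variances. For example, under Model I, we
have

$$
\begin{aligned}
\hat{\mu} &= \bar{y}_{...}, &
\mathrm{Var}(\bar{y}_{...}) &= \sigma_e^2 / p^2; \\
\widehat{\mu + \alpha_i} &= \bar{y}_{i..}, &
\mathrm{Var}(\bar{y}_{i..}) &= \sigma_e^2 / p; \\
\hat{\alpha}_i &= \bar{y}_{i..} - \bar{y}_{...}, &
\mathrm{Var}(\bar{y}_{i..} - \bar{y}_{...}) &= (p - 1)\sigma_e^2 / p^2; \\
\widehat{\alpha_i - \alpha_{i'}} &= \bar{y}_{i..} - \bar{y}_{i'..}, &
\mathrm{Var}(\bar{y}_{i..} - \bar{y}_{i'..}) &= 2\sigma_e^2 / p; \\
\widehat{\sum_{i=1}^{p} \ell_i \alpha_i} &= \sum_{i=1}^{p} \ell_i \bar{y}_{i..}, &
\mathrm{Var}\Bigl(\sum_{i=1}^{p} \ell_i \bar{y}_{i..}\Bigr) &=
\Bigl(\sum_{i=1}^{p} \ell_i^2\Bigr) \sigma_e^2 / p
\quad \Bigl(\sum_{i=1}^{p} \ell_i = 0\Bigr); \\
\hat{\sigma}_e^2 &= MS_E, &
\mathrm{Var}(MS_E) &= 2\sigma_e^4 / [(p - 1)(p - 2)].
\end{aligned}
$$

A 100(1 - α) percent confidence interval for $\sigma_e^2$ is given by

$$
\frac{(p - 1)(p - 2)MS_E}{\chi^2[(p - 1)(p - 2), 1 - \alpha/2]}
< \sigma_e^2 <
\frac{(p - 1)(p - 2)MS_E}{\chi^2[(p - 1)(p - 2), \alpha/2]}.
$$

Confidence intervals for fixed effects parameters considered previously can be
constructed from the results of their sampling variances.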
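The confidence interval for the error variance can be evaluated once the two chi-square percentage points are available. The sketch below is illustrative only: the function name is hypothetical, the error mean square is the value reported later in Table 10.10 (p = 5, so 12 error degrees of freedom), and the percentage points 4.404 and 23.337 are the usual tabulated 2.5 and 97.5 percent points of the chi-square distribution with 12 degrees of freedom, supplied by hand rather than computed.

```python
def error_variance_interval(ms_error, p, chi2_lower_point, chi2_upper_point):
    """Confidence limits for the error variance in a p x p Latin square.

    chi2_lower_point : chi-square[(p-1)(p-2), alpha/2]
    chi2_upper_point : chi-square[(p-1)(p-2), 1 - alpha/2]
    """
    df_error = (p - 1) * (p - 2)
    ss_error = df_error * ms_error
    # Dividing by the larger percentage point gives the lower limit.
    return ss_error / chi2_upper_point, ss_error / chi2_lower_point


# 95 percent interval using the error mean square of Table 10.10.
lower, upper = error_variance_interval(
    ms_error=3552.36, p=5, chi2_lower_point=4.404, chi2_upper_point=23.337
)
print(round(lower, 1), round(upper, 1))
```

The interval is wide because only 12 degrees of freedom are available for error.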


POWER OF THE F TEST 


One can calculate the power of the test in the same manner as discussed in
earlier chapters. For example, the noncentrality parameter $\phi$ with respect to
the hypothesis $H_0^{\tau}$ is given by


Here, $\nu_1 = p - 1$ and $\nu_2 = (p - 1)(p - 2)$. Except for this modification, the
power calculations remain unchanged.


MULTIPLE COMPARISONS 


For Models I and III, Tukey, Scheffé, and other procedures described in
Section 2.19 may be readily adapted for use with the Latin square design.
For example, consider a contrast of the form

$$
L = \sum_{i=1}^{p} \ell_i \alpha_i \qquad \Bigl(\sum_{i=1}^{p} \ell_i = 0\Bigr),
$$

which is estimated by

$$
\hat{L} = \sum_{i=1}^{p} \ell_i \bar{y}_{i..}.
$$

Then, using Tukey's method, L is significantly different from zero with
confidence coefficient 1 - α if

$$
\frac{|\hat{L}|}{\sqrt{p^{-1} MS_E}\,\Bigl(\frac{1}{2} \sum_{i=1}^{p} |\ell_i|\Bigr)}
> q[p, (p - 1)(p - 2); 1 - \alpha].
$$

If Scheffé's method is applied to these comparisons, then L is significantly
different from zero with confidence coefficient 1 - α if

$$
\frac{|\hat{L}|}{\sqrt{(p - 1) MS_E \Bigl(\sum_{i=1}^{p} \ell_i^2 / p\Bigr)}}
> \{F[p - 1, (p - 1)(p - 2); 1 - \alpha]\}^{1/2}.
$$

Similar modifications are made for other contrasts and procedures.




COMPUTATIONAL FORMULAE 


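The computation of sums of squares can be performed easily by using the
following computational formulae:

$$
\begin{aligned}
SS_R &= \frac{1}{p} \sum_{i=1}^{p} y_{i..}^2 - \frac{y_{...}^2}{p^2}, \\
SS_C &= \frac{1}{p} \sum_{j=1}^{p} y_{.j.}^2 - \frac{y_{...}^2}{p^2}, \\
SS_\tau &= \frac{1}{p} \sum_{k=1}^{p} y_{..k}^2 - \frac{y_{...}^2}{p^2}, \\
SS_T &= \sum_{i=1}^{p} \sum_{j=1}^{p} \sum_{k=1}^{p} y_{ijk}^2 - \frac{y_{...}^2}{p^2},
\end{aligned}
$$

and

$$
SS_E = SS_T - SS_R - SS_C - SS_\tau.
$$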


MISSING OBSERVATIONS 


When a single observation $y_{ijk}$ is missing, its value is estimated by

$$
\hat{y}_{ijk} = \frac{p(y'_{i..} + y'_{.j.} + y'_{..k}) - 2y'_{...}}{(p - 1)(p - 2)},
\tag{10.4.3}
$$

where the primes indicate the previously defined totals with one observation
missing. After substituting the estimate (10.4.3) for the missing value, the sums
of squares are calculated in the usual way. To correct the treatment mean square
for possible bias, the quantity

$$
\frac{[y'_{...} - y'_{i..} - y'_{.j.} - (p - 1)y'_{..k}]^2}{(p - 1)^3 (p - 2)^2}
$$

is subtracted from the treatment mean square. The variance of the mean of the
treatment with a missing value is

$$
\mathrm{Var}(\bar{y}_{..k}) = \left[\frac{1}{p} + \frac{1}{(p - 1)(p - 2)}\right] \sigma_e^2,
$$

and the variance of the difference between two treatment means (involving one
with the missing value) is

$$
\left[\frac{2}{p} + \frac{1}{(p - 1)(p - 2)}\right] \sigma_e^2,
$$

which is slightly larger than the usual expression $2\sigma_e^2/p$ for the case of
no missing value.
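A quick way to see how formula (10.4.3) behaves is to apply it to artificial data that follow the additive model exactly, with no error term. In that case the estimate should reproduce the deleted value. The sketch below does this for a hypothetical 4 x 4 cyclic Latin square; all effects and names are invented for illustration and do not come from the text.

```python
def estimate_missing_value(p, row_total, col_total, trt_total, grand_total):
    """Estimate of a single missing value; totals exclude the missing cell."""
    return (p * (row_total + col_total + trt_total) - 2 * grand_total) / ((p - 1) * (p - 2))


# Hypothetical additive data: y = mu + row effect + column effect + treatment effect.
p = 4
mu = 50
row_eff = [0, 3, 5, 8]
col_eff = [0, 1, 2, 4]
trt_eff = [0, 10, 20, 40]
treatment = lambda i, j: (i + j) % p        # cyclic Latin square layout
cells = {(i, j): mu + row_eff[i] + col_eff[j] + trt_eff[treatment(i, j)]
         for i in range(p) for j in range(p)}

missing = (1, 2)                            # cell treated as missing
true_value = cells[missing]
observed = {cell: y for cell, y in cells.items() if cell != missing}

row_total = sum(y for (i, j), y in observed.items() if i == missing[0])
col_total = sum(y for (i, j), y in observed.items() if j == missing[1])
trt_total = sum(y for (i, j), y in observed.items()
                if treatment(i, j) == treatment(*missing))
grand_total = sum(observed.values())

estimate = estimate_missing_value(p, row_total, col_total, trt_total, grand_total)
print(true_value, estimate)
```

With real data the estimate will of course differ from the unknown true value; the check only confirms that the formula is consistent with the additive model.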

For several missing values, more complicated methods are generally required. 
Formulae giving explicit expressions for several missing values can be found 
in Kramer and Glass (1960). However, for a few missing values, an iterative 
scheme may be used. The procedure is to make repeated use of the formula 
(10.4.3). When all missing values have been estimated, the analysis of variance 
is performed in the usual way with the degrees of freedom equal to the number 
of missing values subtracted from the total and error. Detailed discussions on
handling cases with two or more missing values can be found in Steel and
Torrie (1980, pp. 227-228) and Hinkelmann and Kempthorne (1994, Chapter 10).
The analysis of the design when a single row, column, or treatment is missing 
is given by Yates (1936b). The methods of analysis when more than one row, 
column, or treatment is missing are described by Yates and Hale (1939) and 
DeLury (1946). 


TESTS FOR INTERACTION 


Tukey (1955) and Abraham (1960) have generalized Tukey’s one degree of 
freedom test for nonadditivity to Latin squares. Snedecor and Cochran (1989, 
pp. 291-294) and Neter et al. (1990, pp. 1096-1098) provide some additional 
details and numerical examples. For some further discussion of the topic, see 
Milliken and Graybill (1972). Effects of nonadditivity in Latin squares have 
been discussed by Wilk and Kempthorne (1957) and Cox (1958b). 


RELATIVE EFFICIENCY OF THE DESIGN 


Suppose instead of a Latin square design (LSD), a randomized block design
(RBD) with p rows as blocks is used. An estimate of the error variance would
then be given by

$$
\frac{(p - 1)MS_C + (p - 1)^2 MS_E}{p(p - 1)}.
$$

The preceding formula comes from the fact that the column mean square would
be pooled with the error mean square as there are no columns in the RBD.
However, the LSD under the same experimental conditions actually has the
error mean square $MS_E$. Hence, the relative efficiency (RE) of LSD relative to
RBD with rows as blocks (called column efficiency) is given by

$$
RE_{\text{column}} = \frac{(p - 1)MS_C + (p - 1)^2 MS_E}{p(p - 1)MS_E}
= \frac{MS_C + (p - 1)MS_E}{p\,MS_E}.
$$

Similarly, if the columns are treated as blocks, then the RE of LSD relative to
RBD (called row efficiency) is given by

$$
RE_{\text{row}} = \frac{MS_R + (p - 1)MS_E}{p\,MS_E}.
$$
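The column and row efficiencies are simple functions of the mean squares. As an illustration, the sketch below evaluates both using the mean squares reported later in Table 10.10, taking monkeys as rows and weeks as columns (an assumption based on the layout of Table 10.9); the function name is hypothetical.

```python
def latin_square_efficiency(ms_blocking, ms_error, p):
    """RE of a Latin square relative to an RBD that drops one blocking factor.

    ms_blocking is the mean square of the blocking factor that the RBD
    would not control (columns for column efficiency, rows for row efficiency).
    """
    return (ms_blocking + (p - 1) * ms_error) / (p * ms_error)


p = 5
ms_row, ms_col, ms_error = 65740.26, 36128.86, 3552.36   # Table 10.10

re_column = latin_square_efficiency(ms_col, ms_error, p)  # rows kept as blocks
re_row = latin_square_efficiency(ms_row, ms_error, p)     # columns kept as blocks
print(round(re_column, 2), round(re_row, 2))
```

Both ratios exceed 1 for these data, which agrees with the later remark that the Latin square design appears to have been effective.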


REPLICATIONS 


In using a small size Latin square, it is often desirable to replicate it. The usual
model for a Latin square with r replications is

$$
y_{ijk\ell} = \mu + \alpha_i + \beta_j + \tau_k + \rho_\ell + e_{ijk\ell}, \qquad
i, j, k = 1, 2, \ldots, p; \quad \ell = 1, 2, \ldots, r,
$$

where $y_{ijk\ell}$ denotes the observed value corresponding to the i-th row, the
j-th column, the k-th treatment, and the ℓ-th replication; $-\infty < \mu < \infty$
is the overall mean, $\alpha_i$ is the effect of the i-th row, $\beta_j$ is the
effect of the j-th column, $\tau_k$ is the effect of the k-th treatment,
$\rho_\ell$ is the effect of the ℓ-th replication, and $e_{ijk\ell}$ is the
customary error term. When a Latin square is replicated, it is important to know
whether it is replicated using the same blocking variables or there are additional
versions of one or both blocking variables. The analysis of variance for the
general case in which a Latin square is replicated r times using the same blocking
variables proceeds in the same manner as before. However, now, an additional source
of variation due to replicates is introduced. The degrees of freedom for the rows,
columns, and treatments are the same, i.e., $p - 1$, but the degrees of freedom for
the total, replicates, and error are given by $rp^2 - 1$, $r - 1$, and
$(p - 1)[r(p + 1) - 3]$, respectively. When a Latin square is replicated with
additional versions of the row (column) blocking variable, the analysis remains the
same except that now the degrees of freedom for the rows (columns) and the error
are $r(p - 1)$ and $(p - 1)(rp - 2)$, respectively. When a Latin square is
replicated with additional versions of both row and column blocking variables, the
degrees of freedom for the rows, columns, and error are now given by $r(p - 1)$,
$r(p - 1)$, and $(p - 1)[r(p - 1) - 1]$, respectively.
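The three degrees-of-freedom partitions just described can be checked mechanically, since in each case the components must add up to the total. The following sketch, with a hypothetical function name and flags, obtains the error degrees of freedom by subtraction and compares them with the closed forms quoted in the text.

```python
def replicated_latin_square_df(p, r, new_rows=False, new_cols=False):
    """Degrees of freedom for a p x p Latin square replicated r times.

    new_rows / new_cols indicate that each replicate uses additional
    versions of the row / column blocking variable.
    """
    df = {
        "rows": r * (p - 1) if new_rows else p - 1,
        "columns": r * (p - 1) if new_cols else p - 1,
        "treatments": p - 1,
        "replicates": r - 1,
        "total": r * p * p - 1,
    }
    # Error takes whatever remains after the other sources are removed.
    df["error"] = (df["total"] - df["rows"] - df["columns"]
                   - df["treatments"] - df["replicates"])
    return df


print(replicated_latin_square_df(p=4, r=3))
print(replicated_latin_square_df(p=4, r=3, new_rows=True))
print(replicated_latin_square_df(p=4, r=3, new_rows=True, new_cols=True))
```

For p = 4 and r = 3 the error degrees of freedom are 36, 30, and 24 in the three cases.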


Remark: Latin squares were proposed as experimental designs by R. A. Fisher (1925,
1926) and in 1924 he made some early applications of Latin squares in the design of
an experiment in a forest nursery. A Latin square experiment for testing the differences
among four treatments for warp breakage, where time periods and looms were used as
rows and columns, has been described by Tippett (1931). Davies (1954) describes one of
the earliest industrial applications of Latin squares related to wear-testing experiments
of four materials where the runs and positions of a machine were represented as rows and
columns. For a survey of Latin square designs in agricultural experiments, see Street and
Street (1988). The principal reference book on Latin squares is by Dénes and Keedwell
(1974). For a discussion of combinatorial problems in Latin squares, see Street and
Street (1987).


TABLE 10.9
Data on Responses of Monkeys to Different Stimulus Conditions

                                 Week
Monkey    1          2          3          4          5
1         194 (B)    369 (D)    344 (C)    380 (A)    693 (E)
2         202 (D)    142 (B)    200 (A)    356 (E)    473 (C)
3         335 (C)    301 (A)    493 (E)    338 (B)    528 (D)
4         515 (E)    590 (C)    552 (B)    677 (D)    546 (A)
5         184 (A)    421 (E)    355 (D)    284 (C)    366 (B)

Source: Snedecor (1955). Used with permission.


TABLE 10.10
Analysis of Variance for the Data on Responses of Monkeys
to Different Stimulus Conditions

Source of    Degrees of    Sum of         Mean
Variation    Freedom       Squares        Square        F Value    p-Value
Monkey        4            262,961.040    65,740.260    18.51      <0.001
Week          4            144,515.440    36,128.860    10.17      <0.001
Stimulus      4            111,771.440    27,942.860     7.87       0.002
Error        12             42,628.320     3,552.360
Total        24            561,876.240


WORKED EXAMPLE 


Snedecor (1955) reported data from an experiment conducted to study responses 
of pairs of monkeys to a certain kind of stimulus under a variety of conditions. 
The responses were measured on five pairs of monkeys during five successive 
weeks under five different conditions using a Latin square design. The data are 
given in Table 10.9 where the letter within parentheses represents the stimulus 
condition used. 

The analysis of variance calculations are readily performed and the results are 
summarized in Table 10.10. The outputs illustrating the applications of statis- 
tical packages to perform the analysis of variance are presented in Figure 10.5. 


DATA MONKEYS;
INPUT MONKEY WEEK STIMULUS $ RESPONSE;
DATALINES;
1 1 B 194
. . .
5 5 B 366
;
PROC ANOVA;
CLASSES MONKEY WEEK STIMULUS;
MODEL RESPONSE=MONKEY WEEK STIMULUS;
RUN;

The SAS System
Analysis of Variance Procedure

CLASS       LEVELS    VALUES
MONKEY      5         1 2 3 4 5
WEEK        5         1 2 3 4 5
STIMULUS    5         A B C D E
NUMBER OF OBS. IN DATA SET=25

Dependent Variable: RESPONSE
                           Sum of           Mean
Source             DF      Squares          Square         F Value    Pr > F
Model              12      519247.92000     43270.66000    12.18      0.0001
Error              12       42628.32000      3552.36000
Corrected Total    24      561876.24000

R-Square    C.V.         Root MSE     RESPONSE Mean
0.924132    15.145781    59.601678    393.52000000

Source      DF    Anova SS       Mean Square    F Value    Pr > F
MONKEY       4    262961.0400    65740.2600     18.51      0.0001
WEEK         4    144515.4400    36128.8600     10.17      0.0008
STIMULUS     4    111771.4400    27942.8600      7.87      0.0024

(i) SAS application: SAS ANOVA instructions and output for the Latin square design.


DATA LIST
  /MONKEY 1 WEEK 3
   STIMULUS 5
   RESPONSE 7-9.
BEGIN DATA.
1 1 2 194
2 1 4 202
. . .
5 5 2 366
END DATA.
MANOVA RESPONSE BY
  MONKEY (1,5)
  WEEK (1,5)
  STIMULUS (1,5)
  /DESIGN=MONKEY
   WEEK STIMULUS.

Analysis of Variance--Design 1
Tests of Significance for RESPONSE using UNIQUE sums of squares

Source of Variation    SS         F
RESIDUAL                42628.
MONKEY                 262961.    18.51
WEEK                   144515.    10.17
STIMULUS               111771.     7.87
(Model)                519247.    12.18
(Total)                561876.

R-Squared = .924
Adjusted R-Squared = .848

(ii) SPSS application: SPSS MANOVA instructions and output for the Latin square
design.


/INPUT    FILE='C:\SAHAI\TEXTO\EJE25.TXT'.
          FORMAT=FREE.
          VARIABLES=4.
/VARIABLE NAMES=M,W,S,RESP.
/GROUP    VARIABLE=M,W,S.
          CODES(M)=1,2,3,4,5.
          NAMES(M)=M1,...,M5.
          CODES(W)=1,2,3,4,5.
          NAMES(W)=W1,...,W5.
          CODES(S)=1,2,3,4,5.
          NAMES(S)=A,B,C,D,E.
/DESIGN   DEPENDENT=RESP.
          INCLUDE=1,2,3.
/END
1 1 2 194
. . .
5 5 2 366

BMDP2V - ANALYSIS OF VARIANCE AND COVARIANCE WITH REPEATED MEASURES
Release: 7.0 (BMDP/DYNAMIC)

ANALYSIS OF VARIANCE FOR THE 1-ST DEPENDENT VARIABLE
THE TRIALS ARE REPRESENTED BY THE VARIABLES: RESPONSE

SOURCE      SUM OF SQUARES    D.F.    MEAN SQUARE       F          TAIL PROB.
MEAN        3871449.76000      1      3871449.76000     1089.82    0.0000
MONKEY       262961.04000      4        65740.26000       18.51    0.0000
WEEK         144515.44000      4        36128.86000       10.17    0.0008
STIMULUS     111771.44000      4        27942.86000        7.87    0.0024
ERROR         42628.32000     12         3552.36000

(iii) BMDP application: BMDP 2V instructions and output for the Latin square design.


FIGURE 10.5 Program Instructions and Output for the Latin Square Design:
Data on Responses of Monkeys to Different Stimulus Conditions (Table 10.9).




Here, there is a very significant effect due to stimulus conditions. The effects 
due to monkeys and weeks are also highly significant. The use of the Latin 
square design seems to be highly effective. 
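The entries of Table 10.10 can be reproduced from the data of Table 10.9 with the computational formulae for the Latin square. The sketch below is a minimal pure-Python version; the data layout (one list per monkey, one entry per week) and the variable names are choices made here for illustration.

```python
# Table 10.9: rows are monkeys 1-5, columns are weeks 1-5,
# each cell is (response, stimulus letter).
responses = [
    [(194, "B"), (369, "D"), (344, "C"), (380, "A"), (693, "E")],
    [(202, "D"), (142, "B"), (200, "A"), (356, "E"), (473, "C")],
    [(335, "C"), (301, "A"), (493, "E"), (338, "B"), (528, "D")],
    [(515, "E"), (590, "C"), (552, "B"), (677, "D"), (546, "A")],
    [(184, "A"), (421, "E"), (355, "D"), (284, "C"), (366, "B")],
]
p = len(responses)

grand_total = sum(y for row in responses for y, _ in row)
correction = grand_total ** 2 / p ** 2            # y...^2 / p^2

row_totals = [sum(y for y, _ in row) for row in responses]
col_totals = [sum(row[j][0] for row in responses) for j in range(p)]
trt_totals = {}
for row in responses:
    for y, letter in row:
        trt_totals[letter] = trt_totals.get(letter, 0) + y


def ss_from_totals(totals):
    """Sum of squares for a factor from its p totals."""
    return sum(total ** 2 for total in totals) / p - correction


ss_total = sum(y * y for row in responses for y, _ in row) - correction
ss_monkey = ss_from_totals(row_totals)
ss_week = ss_from_totals(col_totals)
ss_stimulus = ss_from_totals(trt_totals.values())
ss_error = ss_total - ss_monkey - ss_week - ss_stimulus

ms_error = ss_error / ((p - 1) * (p - 2))
f_monkey = ss_monkey / (p - 1) / ms_error
f_week = ss_week / (p - 1) / ms_error
f_stimulus = ss_stimulus / (p - 1) / ms_error

print(round(ss_monkey, 2), round(ss_week, 2), round(ss_stimulus, 2), round(ss_error, 2))
print(round(f_monkey, 2), round(f_week, 2), round(f_stimulus, 2))
```

As with the randomized block example, the p-values require the F distribution and are not computed in this sketch.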


10.5 GRAECO-LATIN SQUARE DESIGN 


We have seen that the Latin square design 1s effective for controlling two sources 
of external variation. The principle can be further extended to control more 
sources of variation. The Graeco-Latin square is one such design that can be used 
to control three sources of variation. The design is also useful for investigating 
simultaneous effects of four factors: rows, columns, Latin letters, and Greek 
letters, in a single experiment. The Graeco-Latin square design is obtained by 
juxtaposing or superimposing two Latin squares, one with treatments denoted 
by Latin letters and the other with treatments denoted by Greek letters, such 
that each Latin letter appears once and only once with each Greek letter. The 
designs have been constructed for all numbers of treatments* from 3 to 12 except 6. Some
selected Graeco-Latin squares are shown in Figure 10.6. Some more examples 
are given in Appendix Y, Cochran and Cox (1957, pp. 146-147), and Fisher 
and Yates (1963, pp. 86-89). 


    3 x 3               4 x 4                    5 x 5
 Aα  Bγ  Cβ       Aα  Bγ  Cδ  Dβ       Aα  Bγ  Cε  Dβ  Eδ
 Bβ  Cα  Aγ       Bβ  Aδ  Dγ  Cα       Bβ  Cδ  Dα  Eγ  Aε
 Cγ  Aβ  Bα       Cγ  Dα  Aβ  Bδ       Cγ  Dε  Eβ  Aδ  Bα
                  Dδ  Cβ  Bα  Aγ       Dδ  Eα  Aγ  Bε  Cβ
                                       Eε  Aβ  Bδ  Cα  Dγ


FIGURE 10.6 Some Selected Graeco-Latin Squares. 
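
The defining property of a Graeco-Latin square (each of the two superimposed squares is a Latin square, and every Latin-Greek pair occurs exactly once) can be checked mechanically. The following is a minimal sketch, not part of the original text; the function names and the encoding of the Greek letters alpha through epsilon as the lowercase letters a through e are assumptions made for illustration. It uses the 5 x 5 square of Figure 10.6.

```python
from itertools import product

# 5 x 5 square of Figure 10.6, split into its two component squares.
# Assumption for this sketch: a, b, c, d, e encode alpha, beta, gamma, delta, epsilon.
LATIN_5 = ["ABCDE", "BCDEA", "CDEAB", "DEABC", "EABCD"]
GREEK_5 = ["acebd", "bdace", "cebda", "daceb", "ebdac"]


def is_latin_square(square):
    """True if every symbol occurs exactly once in each row and each column."""
    p = len(square)
    symbols = set(square[0])
    if len(symbols) != p:
        return False
    if not all(len(row) == p and set(row) == symbols for row in square):
        return False
    return all({row[j] for row in square} == symbols for j in range(p))


def is_graeco_latin(latin, greek):
    """True if both squares are Latin and their superposition has p*p distinct pairs."""
    p = len(latin)
    if len(greek) != p:
        return False
    if not (is_latin_square(latin) and is_latin_square(greek)):
        return False
    pairs = {(latin[i][j], greek[i][j]) for i, j in product(range(p), repeat=2)}
    return len(pairs) == p * p


print(is_graeco_latin(LATIN_5, GREEK_5))
```

Superimposing a Latin square on itself fails the check, since only p distinct pairs arise instead of p squared.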


MODEL AND ANALYSIS 


The analysis of variance model for a Graeco-Latin square design is

\[
y_{ijk\ell} = \mu + \alpha_i + \beta_j + \tau_k + \delta_\ell + e_{ijk\ell},
\qquad i, j, k, \ell = 1, 2, \ldots, p,
\]


* Graeco-Latin squares exist for all orders p >= 3 except p = 6. The problem of nonexistence of
Graeco-Latin squares for certain values of p goes back well over 200 years, when the Swiss
mathematician Euler (1782) conjectured that no p x p Graeco-Latin square exists for p = 4m + 2
where m is a positive integer. In 1900, Euler's conjecture was shown to be true for m = 1; that
is, there does not exist a 6 x 6 Graeco-Latin square. However, his conjecture was shown to be
false for m >= 2 by Bose and Shrikhande (1959) and Parker (1959).




where $y_{ijk\ell}$ is the observation corresponding to the i-th row, the j-th column,
the k-th Greek letter, and the $\ell$-th Latin letter; $-\infty < \mu < \infty$ is the overall
mean, $\alpha_i$ is the effect of the i-th row, $\beta_j$ is the effect of the j-th column, $\tau_k$ is
the effect of the k-th Greek letter, $\delta_\ell$ is the effect of the $\ell$-th Latin letter, and
$e_{ijk\ell}$ is the random error. The model assumes additivity of the effects of all four
factors; that is, there are no interactions between rows, columns, Greek letters,
and Latin letters. Furthermore, note that only two of the four subscripts i, j, k,
and $\ell$ are needed to identify a particular observation. This is a consequence of
each Greek letter appearing exactly once with each Latin letter and in each row
and column.

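The analysis of variance of the design is very similar to that of the Latin square
design. The partitioning of the total sum of squares of the $N = p^2$ observations
into components of rows, columns, Greek letters, Latin letters, and the error is
given by

\[
SS_T = SS_R + SS_C + SS_G + SS_L + SS_E,
\]

where

\[
\begin{aligned}
SS_T &= \sum_{i=1}^{p}\sum_{j=1}^{p}\sum_{k=1}^{p}\sum_{\ell=1}^{p} (y_{ijk\ell} - \bar{y}_{....})^2, \\
SS_R &= p \sum_{i=1}^{p} (\bar{y}_{i...} - \bar{y}_{....})^2, \\
SS_C &= p \sum_{j=1}^{p} (\bar{y}_{.j..} - \bar{y}_{....})^2, \\
SS_G &= p \sum_{k=1}^{p} (\bar{y}_{..k.} - \bar{y}_{....})^2, \\
SS_L &= p \sum_{\ell=1}^{p} (\bar{y}_{...\ell} - \bar{y}_{....})^2,
\end{aligned}
\]

and

\[
SS_E = SS_T - SS_R - SS_C - SS_G - SS_L,
\]

the sums in $SS_T$ extending only over the $p^2$ subscript combinations that actually occur in the design.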


The corresponding degrees of freedom are partitioned as

                                      Greek      Latin
 Total       Rows       Columns      Letters    Letters      Error
 p^2 - 1  =  (p - 1)  +  (p - 1)  +  (p - 1)  +  (p - 1)  +  (p - 1)(p - 3)
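
As a quick arithmetic check of this partition (an illustration added here, not taken from the original text), consider a 5 x 5 square:

\[
p = 5: \qquad p^2 - 1 = 24 = 4 + 4 + 4 + 4 + 8, \qquad (p-1)(p-3) = 4 \times 2 = 8 .
\]

The same formula gives $(p-1)(p-3) = 0$ error degrees of freedom for $p = 3$ and only $3$ for $p = 4$, so small squares leave very little information for estimating the error variance.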




TABLE 10.11
Analysis of Variance for the Graeco-Latin Square Design

                                                                  Expected Mean Square*
Source of       Degrees of        Sum of     Mean      -----------------------------------------------------------------------
Variation       Freedom           Squares    Square    Model I                                                Model II                           F Value
Row             p - 1             $SS_R$     $MS_R$    $\sigma_e^2 + \frac{p}{p-1}\sum_{i=1}^{p}\alpha_i^2$      $\sigma_e^2 + p\sigma_\alpha^2$    $MS_R/MS_E$
Column          p - 1             $SS_C$     $MS_C$    $\sigma_e^2 + \frac{p}{p-1}\sum_{j=1}^{p}\beta_j^2$       $\sigma_e^2 + p\sigma_\beta^2$     $MS_C/MS_E$
Greek Letter    p - 1             $SS_G$     $MS_G$    $\sigma_e^2 + \frac{p}{p-1}\sum_{k=1}^{p}\tau_k^2$        $\sigma_e^2 + p\sigma_\tau^2$      $MS_G/MS_E$
Latin Letter    p - 1             $SS_L$     $MS_L$    $\sigma_e^2 + \frac{p}{p-1}\sum_{\ell=1}^{p}\delta_\ell^2$  $\sigma_e^2 + p\sigma_\delta^2$    $MS_L/MS_E$
Error           (p - 1)(p - 3)    $SS_E$     $MS_E$    $\sigma_e^2$                                            $\sigma_e^2$
Total           p^2 - 1           $SS_T$

* The expected mean squares for the mixed model are not shown, but they can be obtained by
replacing the appropriate term by the corresponding term as one changes from fixed to random
effect; for example, replacing $\sum_{i=1}^{p}\alpha_i^2/(p - 1)$ by $\sigma_\alpha^2$.


The usual assumptions of the fixed effects model are:

\[
\sum_{i=1}^{p}\alpha_i = \sum_{j=1}^{p}\beta_j = \sum_{k=1}^{p}\tau_k = \sum_{\ell=1}^{p}\delta_\ell = 0
\]

and the $e_{ijk\ell}$'s are normal random variables with mean zero and variance $\sigma_e^2$.
Under the assumptions of the random effects model, the $\alpha_i$'s, $\beta_j$'s, $\tau_k$'s, and
$\delta_\ell$'s are also normal random variables with mean zero and variances $\sigma_\alpha^2$, $\sigma_\beta^2$,
$\sigma_\tau^2$, and $\sigma_\delta^2$, respectively. Other assumptions leading to a mixed model can also
be made. Now, the expected mean squares are readily derived and the complete
analysis of variance is shown in Table 10.11. The null hypotheses of equal
effects for rows, columns, Greek letters, and Latin letters are tested by dividing
the corresponding mean squares by the error mean square.


Remark: The Graeco-Latin square design has not been used much because the exper- 
imental units cannot be easily balanced in all three groupings. Some early applications 
were described by Dunlop (1933) for testing 5 feeding treatments on pigs and by Tippett 
(1934) involving an industrial experiment. Perry et al. (1980) describe an application 
and advantages and disadvantages of the design in experiments for comparing different 
insect sex attractants. Discussions of the analysis of variance of the design when some 
observations are missing can be found in Yates (1933), Nair (1940), Davies (1960), and 
Dodge and Shah (1977). When p < 6, the number of error degrees of freedom is rather 
inadequate and the design is not practical (see Cochran and Cox (1957, p. 133)). 


TABLE 10.12
Data on Photographic Density for Different Brands of Flash Bulbs*

                                    Camera
Film        1            2            3            4            5
 1      0.64 (Aα)    0.70 (Bγ)    0.73 (Cε)    0.66 (Dβ)    0.66 (Eδ)
 2      0.62 (Bβ)    0.63 (Cδ)    0.69 (Dα)    0.70 (Eγ)    0.78 (Aε)
 3      0.65 (Cγ)    0.72 (Dε)    0.68 (Eβ)    0.64 (Aδ)    0.74 (Bα)
 4      0.64 (Dδ)    0.73 (Eα)    0.68 (Aγ)    0.74 (Bε)    0.72 (Cβ)
 5      0.74 (Eε)    0.73 (Aβ)    0.67 (Bδ)    0.74 (Cα)    0.78 (Dγ)

Source: Johnson and Leone (1964, p. 175). Used with permission.

* The original experiment reported duplicate measurements. Only the first
set of readings is presented here.


TABLE 10.13
Analysis of Variance for the Data on Photographic
Density for Different Brands of Flash Bulbs

Source of    Degrees of    Sum of     Mean
Variation    Freedom       Squares    Square     F value    p-value
Film              4        0.00950    0.00237      7.11      0.010
Camera            4        0.01558    0.00389     11.66      0.002
Brand             4        0.00026    0.00006      0.19      0.936
Filter            4        0.02398    0.00599     17.95     <0.001
Error             8        0.00267    0.00033
Total            24        0.05198


WORKED EXAMPLE


Johnson and Leone (1964, p. 175) presented data from an experiment conducted 
to study the effect of different brands of flash bulbs on photographic density. 
A 5 x 5 Graeco-Latin square design with 5 varieties of cameras, 5 film types, 
and 5 filter types was used. The data are given in Table 10.12 where the Roman 
letter within parentheses represents the brand and the Greek letter represents 
the filter type. 

The analysis of variance calculations are readily performed and the results 
are shown in Table 10.13. The outputs illustrating the applications of statistical 
packages to perform analysis of variance are presented in Figure 10.7. There 
does not seem to be a significant effect of different brands of flash bulbs on 


DATA PHOTOGRPH;
INPUT FILM CAMERA BRAND $ FILTER $ DENSITY;
CARDS;
1 1 A α 0.64
. . . . .
5 5 D γ 0.78
;
PROC ANOVA;
CLASSES FILM CAMERA BRAND FILTER;
MODEL DENSITY=FILM CAMERA BRAND FILTER;
RUN;

The SAS System
Analysis of Variance Procedure

CLASS     LEVELS    VALUES
FILM         5      1 2 3 4 5
CAMERA       5      1 2 3 4 5
BRAND        5      A B C D E
FILTER       5      α β γ δ ε
NUMBER OF OBS. IN DATA SET=25

Dependent Variable: DENSITY
                                 Sum of          Mean
Source               DF         Squares        Square     F Value    Pr > F
Model                16      0.04930400    0.00308150       9.23     0.0017
Error                 8      0.00267200    0.00033400
Corrected Total      24      0.05197600

R-Square         C.V.          Root MSE      DENSITY Mean
0.948592      2.6243060      0.01827567        0.69640000

Source               DF        Anova SS    Mean Square    F Value    Pr > F
FILM                  4      0.00949600     0.00237400       7.11    0.0096
CAMERA                4      0.01557600     0.00389400      11.66    0.0020
BRAND                 4      0.00025600     0.00006400       0.19    0.9361
FILTER                4      0.02397600     0.00599400      17.95    0.0005

(i) SAS application: SAS ANOVA instructions and output for the Graeco-Latin square 
design. 


DATA LIST
 /FILM 1 CAMERA 3
  BRAND 5
  FILTER 7
  DENSITY 9-12(2).
BEGIN DATA.
1 1 1 1 0.64
. . . . .
5 5 4 3 0.78
END DATA.
MANOVA DENSITY BY
  FILM(1,5)
  CAMERA(1,5)
  BRAND(1,5)
  FILTER(1,5)
 /DESIGN=FILM
  CAMERA BRAND
  FILTER.

Analysis of Variance--Design 1

Tests of Significance for DENSITY using UNIQUE sums of squares

Source of Variation        SS    Sig of F
RESIDUAL                  .00
FILM                      .01      .010
CAMERA                    .02      .002
BRAND                     .00      .936
FILTER                    .02      .000
(Model)                   .05      .002
(Total)                   .05

R-Squared = .949
Adjusted R-Squared = .846


(ii) SPSS application: SPSS MANOVA instructions and output for the Graeco-Latin 
square design. 


/INPUT     FILE='C:\SAHAI\TEXTO\EJE26.TXT'.
           FORMAT=FREE.
           VARIABLES=5.
/VARIABLE  NAMES=F,C,B,FI,DENS.
/GROUP     VARIABLE=F,C,B,FI.
           CODES(F)=1,2,3,4,5.
           NAMES(F)=F1,...,F5.
           CODES(C)=1,2,3,4,5.
           NAMES(C)=C1,...,C5.
           CODES(B)=1,2,3,4,5.
           NAMES(B)=B1,...,B5.
           CODES(FI)=1,2,3,4,5.
           NAMES(FI)=FI1,...,FI5.
/DESIGN    DEPENDENT=DENSITY.
           INCLUDE=1,2,3,4.
/END
1 1 1 1 0.64
. . . . .
5 5 4 3 0.78

BMDP2V - ANALYSIS OF VARIANCE AND COVARIANCE WITH REPEATED MEASURES
Release: 7.0 (BMDP/DYNAMIC)

ANALYSIS OF VARIANCE FOR THE 1-ST DEPENDENT VARIABLE
THE TRIALS ARE REPRESENTED BY THE VARIABLES: DENSITY

SOURCE      SUM OF SQUARES   D.F.   MEAN SQUARE          F
MEAN              12.12432      1      12.12432   36300.37
FILM               0.00950      4       0.00237       7.11
CAMERA             0.01558      4       0.00389      11.66
BRAND              0.00026      4       0.00006       0.19
FILTER             0.02398      4       0.00599      17.95
ERROR              0.00267      8       0.00033


(iii) BMDP application: BMDP 2V instructions and output for the Graeco-Latin square 
design. 


FIGURE 10.7 Program Instructions and Output for the Graeco-Latin Square 
Design: Data on Photographic Density for Different Brands of Flash Bulbs (Table 
10.12). 




photographic density. The use of a Graeco-Latin square design in reducing 
variability due to varieties of camera, film types, and filter types seems to be 
highly effective. 
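
The sums of squares in Table 10.13 can be reproduced directly from the data in Table 10.12. The following is a minimal sketch rather than the authors' own computation; the function name, the data layout, and the encoding of the filter types alpha through epsilon as the letters a through e are assumptions made for illustration. Each factor sum of squares is computed as (sum of squared level totals)/p minus the correction factor, and the error sum of squares is obtained by subtraction.

```python
# Table 10.12: rows are films 1-5, columns are cameras 1-5.
# Each cell is (density, brand, filter); a-e encode alpha-epsilon (assumed encoding).
DATA = [
    [(0.64, "A", "a"), (0.70, "B", "c"), (0.73, "C", "e"), (0.66, "D", "b"), (0.66, "E", "d")],
    [(0.62, "B", "b"), (0.63, "C", "d"), (0.69, "D", "a"), (0.70, "E", "c"), (0.78, "A", "e")],
    [(0.65, "C", "c"), (0.72, "D", "e"), (0.68, "E", "b"), (0.64, "A", "d"), (0.74, "B", "a")],
    [(0.64, "D", "d"), (0.73, "E", "a"), (0.68, "A", "c"), (0.74, "B", "e"), (0.72, "C", "b")],
    [(0.74, "E", "e"), (0.73, "A", "b"), (0.67, "B", "d"), (0.74, "C", "a"), (0.78, "D", "c")],
]


def graeco_latin_anova(data):
    """Return (SS, df, MS, F) dictionaries for a p x p Graeco-Latin square."""
    p = len(data)
    cells = [
        (i, j, brand, filt, y)
        for i, row in enumerate(data)
        for j, (y, brand, filt) in enumerate(row)
    ]
    correction = sum(c[4] for c in cells) ** 2 / p**2

    def factor_ss(position):
        # Sum of squared level totals divided by p, minus the correction factor.
        totals = {}
        for c in cells:
            totals[c[position]] = totals.get(c[position], 0.0) + c[4]
        return sum(t * t for t in totals.values()) / p - correction

    names = ["film", "camera", "brand", "filter"]
    ss = {name: factor_ss(pos) for pos, name in enumerate(names)}
    ss_total = sum(c[4] ** 2 for c in cells) - correction
    ss["error"] = ss_total - sum(ss.values())

    df = {name: p - 1 for name in names}
    df["error"] = (p - 1) * (p - 3)
    ms = {name: ss[name] / df[name] for name in ss}
    f = {name: ms[name] / ms["error"] for name in names}
    return ss, df, ms, f


SS, DF, MS, F = graeco_latin_anova(DATA)
for source in ["film", "camera", "brand", "filter"]:
    print(f"{source:8s} SS={SS[source]:.5f}  MS={MS[source]:.5f}  F={F[source]:.2f}")
print(f"error    SS={SS['error']:.5f}  MS={MS['error']:.5f}")
```

The p-values are omitted because they require the F distribution function, which the Python standard library does not provide.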


10.6 SPLIT-PLOT DESIGN 


Split-plot design can be considered as a special case of the two-factor random- 
ized block design where one wants to obtain more precise information about 
one factor and also about the interaction between the two factors, the second 
factor being of secondary importance to the experimenter. Thus, suppose there 
are two factors A and B having a and b levels, respectively. As described in 
the previous sections, one might use a completely randomized design by com- 
pletely randomizing the a x b treatment combinations, or a randomized block 
design (in, say, r randomized blocks), each block containing a x b plots.
Alternatively, suppose we wish to evaluate the effects of factor B and the interaction
between the factors A and B with greater precision than the effects of factor A. 
In this situation, one could arrange the treatments of factor A in a randomized 
block design of r blocks as described earlier. Each of the a x r plots can then 
be divided into b subplots so that the treatments of factor B can now be allo- 
cated at random over each subplot. This design yields more precise information 
on the factor allocated to the split- or subplots at the expense of less precise 
information to the factor assigned to the whole-plots. 

As explained previously, the principal advantage of this type of design is
that, since no attempt is made to obtain precise information on factor A,
larger plots can be used for the a treatments of factor A without regard to
the variability within the blocks. If a = 3, b = 4,
and r = 3, a split-plot design may be laid out as shown in Figure 10.8.


[Layout diagram: Block I, Block II, and Block III, each containing a = 3 whole-plots split into b = 4 subplots.]


FIGURE 10.8 A Layout of a Split-Plot Design. 


Note that the essential feature of a split-plot design is that instead of a x b x r
experimental units obtained after random allocation over the entire a x b x r



units as in a completely randomized design, or obtained after r separate ran- 
domizations over a x b units, as in a simple randomized block design, they 
are obtained by first randomizing treatments of factor B on the b subplots (this 
randomization being performed a x r times) and then randomizing the treat- 
ments of factor A onto the a whole-plots (this randomization being performed 
r times, once for each of the r blocks). The name split-plot has its origin in 
agricultural experimentations where the terms whole-plots (large areas of land) 
and subplots (small areas of land) are in common use. 
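
The two-stage randomization described above can be sketched in a few lines. This is an illustrative sketch only; the function name `split_plot_layout` and the integer labelling of treatment levels are assumptions, not part of the original text. Within each of the r blocks the a levels of factor A are shuffled over the whole-plots (r randomizations), and within each whole-plot the b levels of factor B are shuffled over the subplots (a x r randomizations).

```python
import random


def split_plot_layout(a, b, r, seed=None):
    """Return one randomized split-plot layout.

    The result is a list of r blocks; each block is a list of a whole-plots;
    each whole-plot is a pair (level of A, list of B levels in subplot order).
    """
    rng = random.Random(seed)
    layout = []
    for _ in range(r):
        a_levels = list(range(1, a + 1))
        rng.shuffle(a_levels)          # randomize A over the whole-plots of this block
        block = []
        for level_a in a_levels:
            b_levels = list(range(1, b + 1))
            rng.shuffle(b_levels)      # randomize B over the subplots of this whole-plot
            block.append((level_a, b_levels))
        layout.append(block)
    return layout


# Example with a = 3, b = 4, r = 3, the dimensions used for Figure 10.8.
for number, block in enumerate(split_plot_layout(3, 4, 3, seed=2024), start=1):
    print(f"Block {number}:", block)
```

Fixing the seed makes a layout reproducible; in an actual experiment the seed would be chosen arbitrarily and recorded.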

In a split-plot design when there is a choice, the more important treatments 
requiring a higher level of precision should be assigned to the subplots and 
the treatments of secondary importance should be assigned to the whole-plots. 
However, in many industrial and laboratory experiments, the treatments that 
cannot be administered in small scale are applied to whole-plots and the treat- 
ments that can be conveniently applied to small scale are assigned to the sub- 
plots. This choice of a split-plot design is dictated purely by administrative and 
logistic considerations rather than the precision of the desired information. 


MODEL AND ANALYSIS 


The model for the split-plot design described previously is

\[
y_{ijk} = \mu + B_i + \alpha_j + e_{ij} + \beta_k + (B\beta)_{ik} + (\alpha\beta)_{jk} + \varepsilon_{ijk},
\qquad
\begin{cases}
i = 1, \ldots, r \\
j = 1, \ldots, a \\
k = 1, \ldots, b,
\end{cases}
\]

where $\mu$ is the general or overall mean, $B_i$ is the effect of the i-th block, $\alpha_j$ is
the effect of the j-th treatment of factor A, $e_{ij}$ is the whole-plot error, $\beta_k$ is the
effect of the k-th treatment of factor B, $(B\beta)_{ik}$ is the interaction between the
i-th block and the k-th treatment of factor B, $(\alpha\beta)_{jk}$ is the interaction between
the j-th treatment of factor A and the k-th treatment of factor B, and $\varepsilon_{ijk}$ is the
subplot error. Note that $e_{ij}$ is the same as the $(B\alpha)_{ij}$ interaction and $\varepsilon_{ijk}$ is the
same as the $(B\alpha\beta)_{ijk}$ interaction.

Usually, the blocks are considered as random and factors A and B are fixed.
Thus, the $B_i$'s, $(B\beta)_{ik}$'s, $e_{ij}$'s, and $\varepsilon_{ijk}$'s are normally distributed with mean zero
and variances $\sigma_B^2$, $\sigma_{B\beta}^2$, $\sigma_e^2$, and $\sigma_\varepsilon^2$, respectively. If both A and B are random,
the $\alpha_j$'s, $\beta_k$'s, and $(\alpha\beta)_{jk}$'s are assumed to be normally distributed with zero
means and variances $\sigma_\alpha^2$, $\sigma_\beta^2$, and $\sigma_{\alpha\beta}^2$, respectively. Mixed models with A fixed
and B random, or B fixed and A random can also arise and their assumptions
are analogously stated. The analysis of variance is performed in exactly the
same manner as before. Thus, the total sum of squares is partitioned by the
identity

\[
SS_T = SS_{B\ell} + SS_A + SS_E + SS_B + SS_{B\ell \times B} + SS_{A \times B} + SS_\varepsilon,
\]




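where

\[
\begin{aligned}
SS_T &= \sum_{i=1}^{r}\sum_{j=1}^{a}\sum_{k=1}^{b} (y_{ijk} - \bar{y}_{...})^2, \\
SS_{B\ell} &= ab \sum_{i=1}^{r} (\bar{y}_{i..} - \bar{y}_{...})^2, \\
SS_A &= rb \sum_{j=1}^{a} (\bar{y}_{.j.} - \bar{y}_{...})^2, \\
SS_E &= b \sum_{i=1}^{r}\sum_{j=1}^{a} (\bar{y}_{ij.} - \bar{y}_{i..} - \bar{y}_{.j.} + \bar{y}_{...})^2, \\
SS_B &= ra \sum_{k=1}^{b} (\bar{y}_{..k} - \bar{y}_{...})^2, \\
SS_{B\ell \times B} &= a \sum_{i=1}^{r}\sum_{k=1}^{b} (\bar{y}_{i.k} - \bar{y}_{i..} - \bar{y}_{..k} + \bar{y}_{...})^2, \\
SS_{A \times B} &= r \sum_{j=1}^{a}\sum_{k=1}^{b} (\bar{y}_{.jk} - \bar{y}_{.j.} - \bar{y}_{..k} + \bar{y}_{...})^2,
\end{aligned}
\]

and

\[
SS_\varepsilon = \sum_{i=1}^{r}\sum_{j=1}^{a}\sum_{k=1}^{b}
(y_{ijk} - \bar{y}_{ij.} - \bar{y}_{i.k} - \bar{y}_{.jk} + \bar{y}_{i..} + \bar{y}_{.j.} + \bar{y}_{..k} - \bar{y}_{...})^2 .
\]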


The complete analysis of variance including the degrees of freedom and
expected mean squares is shown in Table 10.14. When both A and B are fixed,
we can test block and factor A effects against the whole-plot error. Similarly,
$B\ell \times B$ and $A \times B$ interactions can be tested against the subplot error. The
factor B effect can be tested against the $B\ell \times B$ interaction. Sometimes, the
$B\ell \times B$ interaction is also considered to be negligible and not included in
the model. Then it is pooled with the subplot error, and the B main effect as
well as the $A \times B$ interaction are tested against the subplot error. Under Models
II and III, however, exact tests may not always exist and pseudo-F tests as
discussed in Section 5.5 would have to be employed.


Remarks: (i) The split-plot technique may also be applied to a Latin square design. The 
a x a Latin square corresponds to the whole-plot treatments. Each whole-plot can be 
further subdivided into b subplots. Now, A treatments are applied randomly to whole- 
plots and B treatments are applied randomly to subplots within a whole-plot. Statistical 
analysis of such a design proceeds on lines similar to that of a randomized block. The 
first stage is an analysis of the $a^2$ whole-plots and the second stage is an analysis of the
subplots within whole-plots. 

(ii) The problem of estimating missing values in a split-plot design has been studied 
by Anderson (1946) and Khargonkar (1948). Formulae for estimating the standard errors 


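TABLE 10.14
Analysis of Variance for the Split-Plot Design

Source of Variation                                      Degrees of Freedom          Sum of Squares            Mean Square
Block                                                    r - 1                       $SS_{B\ell}$              $MS_{B\ell}$
Factor A                                                 a - 1                       $SS_A$                    $MS_A$
Whole-Plot Error (Interaction $B\ell \times A$)          (r - 1)(a - 1)              $SS_E$                    $MS_E$
Factor B                                                 b - 1                       $SS_B$                    $MS_B$
Interaction $B\ell \times B$                             (r - 1)(b - 1)              $SS_{B\ell \times B}$     $MS_{B\ell \times B}$
Interaction $A \times B$                                 (a - 1)(b - 1)              $SS_{A \times B}$         $MS_{A \times B}$
Subplot Error (Interaction $B\ell \times A \times B$)    (r - 1)(a - 1)(b - 1)       $SS_\varepsilon$          $MS_\varepsilon$
Total                                                    rab - 1                     $SS_T$

Expected Mean Square

Source of Variation             Model I                                                                                              Model II
Block                           $\sigma_\varepsilon^2 + b\sigma_e^2 + ab\sigma_B^2$                                                   $\sigma_\varepsilon^2 + b\sigma_e^2 + a\sigma_{B\beta}^2 + ab\sigma_B^2$
Factor A                        $\sigma_\varepsilon^2 + b\sigma_e^2 + \frac{rb}{a-1}\sum_{j=1}^{a}\alpha_j^2$                          $\sigma_\varepsilon^2 + r\sigma_{\alpha\beta}^2 + b\sigma_e^2 + rb\sigma_\alpha^2$
Whole-Plot Error                $\sigma_\varepsilon^2 + b\sigma_e^2$                                                                  $\sigma_\varepsilon^2 + b\sigma_e^2$
Factor B                        $\sigma_\varepsilon^2 + a\sigma_{B\beta}^2 + \frac{ra}{b-1}\sum_{k=1}^{b}\beta_k^2$                    $\sigma_\varepsilon^2 + r\sigma_{\alpha\beta}^2 + a\sigma_{B\beta}^2 + ra\sigma_\beta^2$
Interaction $B\ell \times B$    $\sigma_\varepsilon^2 + a\sigma_{B\beta}^2$                                                           $\sigma_\varepsilon^2 + a\sigma_{B\beta}^2$
Interaction $A \times B$        $\sigma_\varepsilon^2 + \frac{r}{(a-1)(b-1)}\sum_{j=1}^{a}\sum_{k=1}^{b}(\alpha\beta)_{jk}^2$          $\sigma_\varepsilon^2 + r\sigma_{\alpha\beta}^2$
Subplot Error                   $\sigma_\varepsilon^2$                                                                                $\sigma_\varepsilon^2$

[Model III columns (A fixed, B random; A random, B fixed): entries illegible in the source.]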




of differences between two means involving missing values are given by Cochran and 
Cox (1957, pp. 302-303) and are also reported in Steel and Torrie (1980, pp. 388-390). 
A more complete description of this design can be found in the books by Cochran and 
Cox (1957, Chapter 7), Steel and Torrie (1980, Chapter 16), Fleiss (1986, Chapter 13), 
Damon and Harvey (1987, Chapter 7), Snedecor and Cochran (1989, pp. 324-329), and 
Hinkelman and Kempthorne (1994, Chapter 13). 


WORKED EXAMPLE


Steel and Torrie (1980, p. 387) reported data from an experiment conducted by 
J. W. Lambert, at the University of Minnesota, to compare the effect of row 
spacing on the yields of two varieties of soybean. A split-plot design was used 
with a variety as a whole-plot, which was then divided into five subplots, and
row spacing was applied to subplots. The varieties as whole-plot treatments 
were allocated in six blocks using a randomized complete block layout. The 
data on yields in bushels per acre for six blocks are given in Table 10.15. 

The analysis of variance computations are readily performed and the results 
are summarized in Table 10.16. The outputs illustrating the applications of 
statistical packages to perform the analysis of variance are presented in Figure 
10.9. In performing tests of significance, blocks and whole-plot error (block x 
variety interaction) are considered as random leading to expected mean squares 
shown in Table 10.16. We may conclude that there are highly significant dif- 
ferences due to both varieties and row spacings. No significant differences 
are found due to either blocks, or block x spacing and variety x spacing 
interactions. 
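
The choice of denominator for each F test follows from the expected mean squares: each effect is tested against the mean square whose expectation matches its own apart from the term being tested. The sketch below is an illustration added here, with dictionary names that are assumptions; it recomputes the F values of Table 10.16 from the tabulated mean squares, with blocks and varieties tested against the whole-plot error, row spacing against the block x spacing interaction, and the remaining terms against the subplot error.

```python
# Mean squares from Table 10.16.
MEAN_SQUARES = {
    "block": 6.0718,
    "variety": 477.7082,
    "whole_plot_error": 3.0078,
    "spacing": 51.5261,
    "block_x_spacing": 4.3707,
    "variety_x_spacing": 6.3636,
    "subplot_error": 5.3812,
}

# Error term used for each source when blocks are random and both factors are fixed.
DENOMINATOR = {
    "block": "whole_plot_error",
    "variety": "whole_plot_error",
    "whole_plot_error": "subplot_error",
    "spacing": "block_x_spacing",
    "block_x_spacing": "subplot_error",
    "variety_x_spacing": "subplot_error",
}


def f_ratios(mean_squares, denominator):
    """Return the F ratio for every source that has a designated error term."""
    return {
        source: mean_squares[source] / mean_squares[error]
        for source, error in denominator.items()
    }


F_VALUES = f_ratios(MEAN_SQUARES, DENOMINATOR)
for source, value in F_VALUES.items():
    print(f"{source:18s} F = {value:8.3f}  (error term: {DENOMINATOR[source]})")
```

Testing every effect against the subplot error instead, as the default Type III tests of a general linear models program do, gives noticeably different ratios for blocks, varieties, and row spacing.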


10.7 OTHER DESIGNS


The designs described so far in this chapter are relatively simple, commonly 
used designs. There are a great number of other designs that differ mainly due 
to experimental conditions, such as limitations on resources, and the attempt 
to reduce the error variance. In this section, we briefly review some designs 
that are occasionally useful in scientific experimentation. Further details can 
be found in Kempthorne (1952), Federer (1955), Cochran and Cox (1957), and 
Das and Giri (1976). 


INCOMPLETE BLOCK DESIGNS 


In a randomized block design, each treatment must be present in every block. 
However, when there are too many treatments, it may not be possible to ac- 
commodate all factor levels or treatment combinations in each block because 
of limitations of the size of the block (amount of work or space) or lack of 
experimental resources. To overcome this problem, randomized block designs 
are used in which every treatment does not occur in every block. These designs 
are commonly known as incomplete block designs. There are several types of 


TABLE 10.15
Data on Yields of Two Varieties of Soybean

                         Block 1
Row                      Variety*
Spacing (in.)          OM         B
     18               33.6      28.0
     24               31.1      23.7
     30               33.0      23.5
     36               28.4      25.0
     42               31.4      25.7

[Columns for Blocks 2 through 6 (Variety OM and B within each block): values illegible in the source.]

Source: Steel and Torrie (1980, p. 387). Used with permission.
*OM = Ottawa Mandarin, B = Blackhawk.


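TABLE 10.16
Analysis of Variance for the Data on Yields of Two Varieties of Soybean

Source of                              Degrees of     Sum of       Mean
Variation                              Freedom        Squares      Square       F value    p-value
Block                                       5          30.3588       6.0718       2.019     0.230
Variety                                     1         477.7082     477.7082     158.823    <0.001
Whole-plot error (Block x Variety)          5          15.0388       3.0078       0.559     0.730
Row spacing                                 4         206.1043      51.5261      11.789    <0.001
Block x Spacing                            20          87.4137       4.3707       0.812     0.677
Variety x Spacing                           4          25.4543       6.3636       1.183     0.348
Subplot error
(Block x Variety x Spacing)                20         107.6237       5.3812
Total                                      59         949.7018

Expected Mean Square

Block                  $\sigma_\varepsilon^2 + 5\sigma_e^2 + 2 \times 5\,\sigma_B^2$
Variety                $\sigma_\varepsilon^2 + 5\sigma_e^2 + \frac{6 \times 5}{2 - 1}\sum_{j=1}^{2}\alpha_j^2$
Whole-plot error       $\sigma_\varepsilon^2 + 5\sigma_e^2$
Row spacing            $\sigma_\varepsilon^2 + 2\sigma_{B\beta}^2 + \frac{6 \times 2}{5 - 1}\sum_{k=1}^{5}\beta_k^2$
Block x Spacing        $\sigma_\varepsilon^2 + 2\sigma_{B\beta}^2$
Variety x Spacing      $\sigma_\varepsilon^2 + \frac{6}{(2 - 1)(5 - 1)}\sum_{j=1}^{2}\sum_{k=1}^{5}(\alpha\beta)_{jk}^2$
Subplot error          $\sigma_\varepsilon^2$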




INPUT BLOCK SPACING $ VARIETY $ YIELD;
DATALINES;
1 18" OM 33.6
1 24" OM 31.1
. . . .
6 42" B 22.9
;
PROC GLM;
CLASSES BLOCK VARIETY SPACING;
MODEL YIELD=BLOCK VARIETY SPACING
      BLOCK*VARIETY SPACING*BLOCK
      VARIETY*SPACING;
RANDOM BLOCK BLOCK*VARIETY BLOCK*SPACING;
TEST H=BLOCK E=BLOCK*VARIETY;
TEST H=VARIETY E=BLOCK*VARIETY;
TEST H=SPACING E=BLOCK*SPACING;
RUN;

The SAS System
General Linear Models Procedure

CLASS      LEVELS    VALUES
BLOCK         6      1 2 3 4 5 6
VARIETY       2      B OM
SPACING       5      18" 24" 30" 36" 42"

Dependent Variable: YIELD
                                   Sum of            Mean
Source               DF           Squares          Square     F Value    Pr > F
Model                39      842.07816667     21.59174786       4.01     0.0008
Error                20      107.62366667      5.38118333
Corrected Total      59      949.70183333

R-Square         C.V.         Root MSE      YIELD Mean
0.886676      8.2518685      2.3197378     28.11166667

Source               DF     Type III SS    Mean Square    F Value    Pr > F
BLOCK                 5       30.358833       6.071767       1.13    0.3776
VARIETY               1      477.708167     477.708167      88.77    0.0001
SPACING               4      206.104333      51.526083       9.58    0.0002
BLOCK*VARIETY         5       15.038833       3.007767       0.56    0.7301
BLOCK*SPACING        20       87.413667       4.370683       0.81    0.6768
VARIETY*SPACING       4       25.454333       6.363583       1.18    0.3486

Source               Type III Expected Mean Square
BLOCK                Var(Error) + 2 Var(BLOCK*SPACING) + 5 Var(BLOCK*VARIETY) + 10 Var(BLOCK)
VARIETY              Var(Error) + 5 Var(BLOCK*VARIETY) + Q(VARIETY,VARIETY*SPACING)
SPACING              Var(Error) + 2 Var(BLOCK*SPACING) + Q(SPACING,VARIETY*SPACING)
BLOCK*VARIETY        Var(Error) + 5 Var(BLOCK*VARIETY)
BLOCK*SPACING        Var(Error) + 2 Var(BLOCK*SPACING)
VARIETY*SPACING      Var(Error) + Q(VARIETY*SPACING)

Tests of Hypotheses using the Type III MS for BLOCK*VARIETY as an error term
Source               DF     Type III SS      Mean Square    F Value
BLOCK                 5     30.35883333       6.07176667       2.02

Tests of Hypotheses using the Type III MS for BLOCK*VARIETY as an error term
Source               DF     Type III SS      Mean Square    F Value
VARIETY               1    477.70816667     477.70816667     158.82

Tests of Hypotheses using the Type III MS for BLOCK*SPACING as an error term
Source               DF     Type III SS      Mean Square    F Value
SPACING               4    206.10433333      51.52608333      11.79


DATA LIST
 /BLOCK 1
  SPACING 3-4
  VARIETY 6
  YIELD 8-11(1).
BEGIN DATA.
1 18 1 33.6
. . . .
END DATA.
GLM YIELD BY BLOCK
  SPACING VARIETY
 /DESIGN=BLOCK
  VARIETY SPACING
  BLOCK*SPACING
  BLOCK*VARIETY
  SPACING*VARIETY
 /RANDOM BLOCK.

Tests of Between-Subjects Effects          Dependent Variable: YIELD

Source                            Type III SS      df    Mean Square         F     Sig.
BLOCK             Hypothesis         30.359         5       6.072         3.040    .421
                  Error               1.891      .947       1.997 (a)
VARIETY           Hypothesis        477.708         1     477.708       158.825    .000
                  Error              15.039         5       3.008 (b)
SPACING           Hypothesis        206.104         4      51.526        11.789    .000
                  Error              87.414        20       4.371 (c)
BLOCK*SPACING     Hypothesis         87.414        20       4.371          .812    .677
                  Error             107.624        20       5.381 (d)
BLOCK*VARIETY     Hypothesis         15.039         5       3.008          .559    .730
                  Error             107.624        20       5.381 (d)
SPACING*VARIETY   Hypothesis         25.454         4       6.364         1.183    .349
                  Error             107.624        20       5.381 (d)
a MS(B*S)+MS(B*V)-MS(E)   b MS(B*V)   c MS(B*S)   d MS(Error)

Expected Mean Squares (a,b)
                                    Variance Component
Source              Var(B)    Var(B*S)    Var(B*V)    Var(Error)    Quadratic Term
BLOCK               10.000      2.000       5.000       1.000
VARIETY               .000       .000       5.000       1.000       Variety
SPACING               .000      2.000        .000       1.000       Spacing
BLOCK*SPACING         .000      2.000        .000       1.000
BLOCK*VARIETY         .000       .000       5.000       1.000
SPACING*VARIETY       .000       .000        .000       1.000       Variety*Spacing
Error                 .000       .000        .000       1.000
a For each source, the expected mean square equals the sum of the
coefficients in the cells times the variance components, plus a
quadratic term involving effects in the Quadratic Term cell.
b Expected Mean Squares are based on the Type III Sums of Squares.


(ii) SPSS application: SPSS GLM instructions and output for the split-plot design. 


FIGURE 10.9 Program Instructions and Output for the Split-Plot Design: Data 
on Yields of Two Varieties of Soybean (Table 10.15). 




/INPUT     FILE='C:\SAHAI\TEXTO\EJE27.TXT'.
           FORMAT=FREE.
           VARIABLES=5.
/VARIABLE  NAMES=S1,S2,S3,S4,S5.
/DESIGN    NAMES=BLOCK,VARIETY,SPACING.
           LEVELS=6,2,5.
           RANDOM=BLOCK.
           FIXED=VARIETY,SPACING.
           MODEL='B, V, S'.
/END
33.6 31.1 33.0 28.4 31.4
28.0 23.7 23.5 25.0 25.7
. . . . .
36.1 30.3 27.9 26.9 33.4
28.3 23.8 22.0 24.5 22.9

BMDP8V - GENERAL MIXED MODEL ANALYSIS OF VARIANCE - EQUAL CELL SIZES
Release: 7.0 (BMDP/DYNAMIC)

ANALYSIS OF VARIANCE DESIGN
INDEX               B     V     S
NUMBER OF LEVELS    6     2     5
POPULATION SIZE    INF    2     5
MODEL              B, V, S

ANALYSIS OF VARIANCE FOR DEPENDENT VARIABLE 1

    SOURCE     ERROR    SUM OF SQUARES   D.F.   MEAN SQUARE         F     PROB.   EXPECTED MEAN    ESTIMATES OF
               TERM                                                               SQUARE           VARIANCE COMPONENTS
 1  MEAN       BLOCK        47415.9477      1     47415.948    7809.25   0.0000   60(1)+10(2)        790.16460
 2  BLOCK                      30.3588      5         6.072                       10(2)                0.60718
 3  VARIETY    BV             477.7082      1       477.708     158.83   0.0001   30(3)+5(5)          15.82335
 4  SPACING    BS             206.1043      4        51.526      11.79   0.0000   12(4)+2(6)           3.92962
 5  BV                         15.0388      5         3.008                       5(5)                 0.60155
 6  BS                         87.4137     20         4.371                       2(6)                 2.18534
 7  VS         BVS             25.4543      4         6.364       1.18   0.3486   6(7)+(8)             0.16373
 8  BVS                       107.6237     20         5.381                       (8)                  5.38118


FIGURE 10.9 (continued) 


incomplete block designs, the simplest of which involve blocks of equal size 
and all treatments equally replicated. If an incomplete block design has t treat- 
ments, b blocks with c experimental units within each block and there are r 
replications of each treatment, then the number of times any two treatments 
appear together in a block is $\lambda = r(c - 1)/(t - 1) = n(c - 1)/[t(t - 1)]$ where
n = tr. When it is desired to make all treatment comparisons with equal preci- 
sion, the incomplete block designs are formed such that every pair of treatments 
occurs together the same number of times. Such designs are called balanced 
incomplete block designs and were originally proposed by Yates (1936a). For 
a list of some useful balanced incomplete block designs, see Box et al. (1978, 
pp. 270-274). Balanced incomplete block designs do not always exist or may 
result in excessively large block sizes. To reduce the number of blocks required 
in an experiment, the experimenter can employ designs known as partially 
balanced incomplete block designs in which different pairs of treatments ap- 
pear together a different number of times. For further discussions of incomplete 
block designs, see Cochran and Cox (1957, Chapters 9 and 13) and Cox (1958a, 
pp. 231-245). 
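
The counting relations for a balanced incomplete block design can be checked numerically before searching a catalogue of designs. The sketch below is illustrative only (the function name `bibd_lambda` is an assumption): it verifies that the total number of units agrees, tr = bc, and that the number of times each pair of treatments appears together, r(c - 1)/(t - 1), is a whole number. These conditions are necessary but not sufficient; a design with the given parameters may still fail to exist.

```python
def bibd_lambda(t, b, c, r):
    """Return the pair-concurrence count for (t, b, c, r), or None if impossible.

    t: treatments, b: blocks, c: units per block, r: replications per treatment.
    """
    if t * r != b * c:            # both sides count the total number of units
        return None
    numerator = r * (c - 1)
    if numerator % (t - 1) != 0:  # the pair count must be an integer
        return None
    return numerator // (t - 1)


print(bibd_lambda(7, 7, 3, 3))   # 7 treatments in 7 blocks of size 3
print(bibd_lambda(5, 5, 3, 3))   # pair count would not be an integer
```

For t = 7, b = 7, c = 3, r = 3 the function returns 1, whereas t = 5, b = 5, c = 3, r = 3 returns None because 3 x 2 / 4 is not an integer.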


LATTICE DESIGNS 


Lattice designs are a class of incomplete block designs introduced by Yates 
(1937b) to increase the precision of treatment comparisons in agricultural 
crop cultivar trials. The designs are also sometimes called quasi-factorials
because of their analogy to confounding in factorial experiments. For example,
if $k^2$ treatments are to be compared, one can arrange them as the points of a




two-dimensional lattice and regard the points as representing the treatments
in a two-factor experiment. Suppose a balanced incomplete block layout with
$k^2$ treatments is arranged in b = k(k + 1) blocks with k units per block and
r = k + 1 replicates for each treatment. Such a design is called a balanced lattice.
In a balanced lattice, the number of treatments is always an exact square
and the size of the block is the square root of this number. Incomplete lattice
designs are grouped to form separate replications. In a balanced incomplete
lattice, every pair of treatments occurs once in the same incomplete block. This
allows the same degree of precision for all treatment pairs being compared. Lattice
designs may involve a large number of treatments and in order to reduce
the size of the design, partially balanced lattice designs are also used. Further
details of the lattice designs are given in Kempthorne (1952), Federer (1955), 
and Cochran and Cox (1957). The SAS PROC LATTICE performs the analysis 
of variance and analysis of simple covariance using experimental data obtained 
from a lattice design. The procedure analyzes data from balanced square lat- 
tices, partially balanced square lattices, and some other rectangular lattices. 
For further information and applications of PROC LATTICE, see SAS Institute 
(1997, Chapter 14). 
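
The statement that every pair of treatments occurs together exactly once in a balanced lattice follows from the counting relation for balanced incomplete block designs. As a short check added here for illustration, substitute $t = k^2$ treatments, block size $c = k$, and $r = k + 1$ replicates:

\[
b = \frac{tr}{c} = \frac{k^2 (k + 1)}{k} = k(k + 1),
\qquad
\lambda = \frac{r(c - 1)}{t - 1} = \frac{(k + 1)(k - 1)}{k^2 - 1} = 1 .
\]

For example, $k = 3$ gives 9 treatments in 12 blocks of 3 units, with 4 replicates of each treatment.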


YOUDEN SQUARES 


Youden squares are constructed by a rearrangement of certain of the balanced 
incomplete block designs and possess the property of "two-way control" of
Latin squares. They are special types of incomplete Latin squares in which the 
number of columns, rows, and treatments are not all equal. If a column or row 
is deleted from a Latin square, the remaining layout is always a Youden square. 
However, omission of two or more rows or columns does not in general produce 
a Youden square. Youden squares can also be thought of as symmetrically 
balanced incomplete block designs by means of which two sources of variation 
can be controlled. These designs were developed by Youden (1937, 1940) in 
investigations involving greenhouse experiments. The name Youden square was 
given by Yates (1936b). The standard analysis of variance of a Youden square 
design is similar to that of a balanced incomplete randomized block design. 
A detailed treatment of planning and analysis of Youden squares is given in 
Natrella (1963, Section 13.6). A table of Youden squares is given in Davies 
(1960) and other types of incomplete Latin squares are discussed by Cochran 
and Cox (1957, Chapter 13). 
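
The claim that deleting one row from a Latin square leaves a Youden square can be illustrated numerically. In the sketch below (an illustration with assumed function names, using a cyclic 4 x 4 Latin square), the last row is removed and the columns are treated as incomplete blocks; the balance property requires every pair of treatments to occur together in the same number of columns.

```python
from itertools import combinations


def cyclic_latin_square(p):
    """Return the p x p cyclic Latin square with entry (i + j) mod p."""
    return [[(i + j) % p for j in range(p)] for i in range(p)]


def pair_counts_by_column(rows):
    """Count, for each pair of treatments, the columns in which both appear."""
    counts = {}
    for j in range(len(rows[0])):
        column = sorted(row[j] for row in rows)
        for pair in combinations(column, 2):
            counts[pair] = counts.get(pair, 0) + 1
    return counts


square = cyclic_latin_square(4)
youden = square[:-1]                     # delete the last row: 3 rows x 4 columns
counts = pair_counts_by_column(youden)
print(youden)
print(counts)
```

With t = 4 treatments, c = 3 units per column, and r = 3 replicates, every pair occurs together in r(c - 1)/(t - 1) = 2 columns, and each row still contains every treatment once.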


CROSS-OVER DESIGNS 


In most experimental designs, each subject is assigned only to a single treatment 
during the entire course of the experiment. In a cross-over design, the total dura- 
tion of the experiment is divided into several periods and the treatment of each 
subject changes from each period to the next. In a cross-over study involving 
k treatments, each treatment is allocated to an equal number of subjects and is 
applied to each subject in k different time periods. Since the order of treatment 




assignment to experimental units may have some consequences regarding the 
effectiveness of different treatments, the order of treatment is chosen randomly 
so as to eliminate the order effects. This type of design is particularly suited 
for animal and human subjects. The intervening period between the assignment 
of different treatments depends on the objectives of the experiment and other 
experimental considerations. For example, suppose in an experiment involving 
human subjects, the effect of two treatments is investigated. In the first period, 
half of the subjects are randomly assigned to treatment 1 and the other half to 
treatment 2. At the end of the study period, the subjects are evaluated for the 
desired response and sufficient time is allowed so that the biological effect of 
each treatment is eliminated. In the second period, the subjects who were as- 
signed treatment 1 are given treatment 2 and vice versa. The cross-over designs 
can be analyzed as a set of Latin squares with rows as time periods, columns as 
subjects, and treatments as letters. Cross-over designs have been used
successfully in clinical trials, bioassay, and animal nutrition experiments. For further
discussions of cross-over designs see Cochran and Cox (1957, Section 4.4), Cox 
(1958a, Chapter 13), John (1971, Chapter 6), John and Quenouille (1977, Chap- 
ter 11), Fleiss (1986, Chapter 10), Jones and Kenward (1989), Senn (1993), and 
Ratkowsky et al. (1993). 
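
The randomization in the two-treatment, two-period example above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name, the subject labels, and the seed are hypothetical choices, and an even number of subjects is assumed.

```python
import random

def two_period_crossover(subject_ids, seed=None):
    """Randomly split subjects into the sequences (1, 2) and (2, 1).

    Each tuple gives the treatment in period 1 and in period 2.
    Assumes an even number of subjects (an assumption of this sketch).
    """
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    plan = {}
    for s in shuffled[:half]:
        plan[s] = (1, 2)   # treatment 1 first, then treatment 2
    for s in shuffled[half:]:
        plan[s] = (2, 1)   # treatment 2 first, then treatment 1
    return plan

# Hypothetical study with eight subjects labelled 1..8
plan = two_period_crossover(range(1, 9), seed=0)
for subject in sorted(plan):
    print(subject, plan[subject])
```

Every subject receives both treatments, and half of the subjects start on each one, so the treatment comparison is not tied to a single order of administration.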


REPEATED MEASURES DESIGNS 


Any design involving k (k ≥ 2) successive measurements on the same subject 
is called a repeated measures design. In a repeated measures design, subjects 
are crossed with the factor involving repeated measures. The k measurements 
may correspond to different times, trials, or experimental conditions. For ex- 
ample, blood pressures may be measured at successive time periods, say, once 
a week, for a group of patients attending a clinic, or animals may be injected with
different drugs and measurements made after each injection. If possible, the order 
of assignment of k repeated measures should be selected randomly. Of course, 
when repeated measures are taken in different time sequences, it is not possible 
to include randomization. In repeated measures designs, each subject acts as 
his or her own control. This helps to control for variability between subjects 
since the same subject is measured repeatedly. Thus, repeated measures designs 
are used to control for the presence of many extraneous factors while at the same 
time limiting the total number of experimental units. A major concern in
repeated measures designs is that no carry-over or residual effects are present 
from treatment at one time period to response at the next time period. Thus, 
as in the case of cross-over designs, sufficient time must be allowed to elim- 
inate any carry-over effect from the previous treatment. When this cannot be 
achieved, cross-over designs are to be preferred. It is important to point out 
that it is incorrect to analyze the time dimension in repeated measures studies 
by the straightforward application of the analysis of variance. For a complete 
coverage of repeated measures designs, see Fleiss (1986, Chapter 8), Maxwell 
and Delaney (1990), Winer et al. (1991), and Kirk (1995). For a book-length 
treatment of the topic see Crowder and Hand (1990) and Lindsey (1993). The
latter work also includes a fairly extensive and classified bibliography on repeated 
measures. Hedayat and Afsarinejad (1975, 1978) have given an extensive sur- 
vey and bibliography on repeated measures designs. For analysis of repeated 
measures data using SAS procedures, PROC GLM and PROC MIXED, see 
Littell et al. (1996, Chapter 3). 


HYPER-GRAECO-LATIN AND HYPER SQUARES 


The principle of Latin and Graeco-Latin square designs can be further extended 
to control for four or more sources of variation. A hyper-Graeco-Latin square is 
a design which can be used to control four sources of variation. The design can 
also be used to investigate simultaneous effects of five factors: rows, columns, 
Latin letters, Greek letters, and Hebrew letters, in a single experiment. The 
hyper-Graeco-Latin square design is obtained by juxtaposing or superimposing 
three Latin squares, one with treatments denoted by Greek letters, the second 
with treatments denoted by Latin letters, and the third with treatments denoted 
by Hebrew letters, such that each Hebrew letter appears once and only once 
with each Greek and each Latin letter. The number of Latin squares that can be
combined in forming hyper-Graeco-Latin squares is limited. For example, no more 
than three orthogonal 4 x 4 Latin squares can be combined and no more than 
four orthogonal 5 x 5 Latin squares can be combined. The sum of squares for- 
mulae for rows, columns, Greek letters, Latin letters, and Hebrew letters follow 
the same general pattern as the corresponding formulae in Latin and Graeco- 
Latin square designs. The concept of superimposing two or more orthogonal 
Latin squares in forming Graeco-Latin and hyper-Graeco-Latin squares can be 
extended even further. A p x p hypersquare is a design in which three or more 
orthogonal p x p Latin squares are superimposed. In general, one can investigate 
a maximum of $p + 1$ factors if a complete set of $p - 1$ orthogonal Latin squares 
is available. In such a design, one would utilize all $(p + 1)(p - 1) = p^2 - 1$ 
degrees of freedom, so that an independent estimate of the error variance would 
be required. Of course, the researcher must assume that there would be no in- 
teractions between factors when using hypersquares. For a detailed discussion 
of hyper-Graeco-Latin squares, and other hypersquares, see Federer (1955). 
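
The limits quoted above (at most four orthogonal 5 x 5 Latin squares) can be checked with a short Python sketch. It uses the standard construction for a prime side p, in which square m has entry (m·i + j) mod p. The function names are hypothetical, and the construction is offered only as an illustration for prime p, not as the method used in the references cited.

```python
def orthogonal_latin_squares(p):
    """Return p - 1 squares L_m[i][j] = (m*i + j) mod p, for m = 1..p-1.

    Assumes p is prime (an assumption of this sketch).
    """
    return [[[(m * i + j) % p for j in range(p)] for i in range(p)]
            for m in range(1, p)]

def is_latin(sq):
    """Each symbol appears exactly once in every row and every column."""
    p = len(sq)
    symbols = set(range(p))
    rows_ok = all(set(row) == symbols for row in sq)
    cols_ok = all({sq[i][j] for i in range(p)} == symbols for j in range(p))
    return rows_ok and cols_ok

def are_orthogonal(sq1, sq2):
    """Superimposing the squares yields every ordered pair exactly once."""
    p = len(sq1)
    pairs = {(sq1[i][j], sq2[i][j]) for i in range(p) for j in range(p)}
    return len(pairs) == p * p

p = 5
squares = orthogonal_latin_squares(p)
print(len(squares))                       # number of squares built
print((p + 1) * (p - 1) == p ** 2 - 1)    # degrees of freedom used by p + 1 factors
```

Superimposing all p - 1 squares on the rows and columns gives the p + 1 factors mentioned in the text, each with p - 1 degrees of freedom, which exhausts the p^2 - 1 degrees of freedom of the p x p layout.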


MAGIC AND SUPER MAGIC LATIN SQUARES 


These are Latin square designs with additional restrictions placed on the group- 
ing of treatments within a Latin square in order to reduce the error term. For 
this purpose, additional smaller squares or rectangles are formed within a Latin 
square in order to remove additional variation from the error term. If the use of 
squares or rectangles to remove variation is done in only one direction, the de- 
sign is called a magic Latin square. If the technique is used to control variation 
in both directions, the design is called a super magic Latin square. These designs 
were initially developed by Gertrude M. Cox and have been used in sugarcane 
research in Hawaii and at the Geneva Experimental Station in New York. 




SPLIT-SPLIT-PLOT DESIGN 


In a split-plot design, each subplot may be further subdivided into a number 
of sub-subplots to which a third set of treatments corresponding to c levels 
of a factor C may be applied. In a split-split-plot design, three factors are 
assigned to the various levels of experimental units, using three distinct stages 
of randomization. The a levels of factor A are randomly assigned to whole-plots; 
b levels of factor B are randomly assigned to subplots within a whole-plot; and 
c levels of factor C are randomly assigned to sub-subplots within a subplot. 
For such a design there will be three error variances: whole-plot error for the 
A treatments, subplot error for the B treatments, and sub-subplot error for 
the C treatments. The details of statistical analysis follow the same general 
pattern as that of the split-plot design. Finally, it should be noted that in a 
split-split-plot design, the three error sums of squares and their corresponding 
degrees of freedom would add up to the sum of squares and the degrees of 
freedom for the single error term if the experiment were conducted in a standard 
randomized block design of abc units. For further information about the split- 
split-plot design, see Anderson and McLean (1974, Section 7.2) and Koch et al. 
(1988). 
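
The additivity of the error degrees of freedom can be verified directly. As an assumption made only for this sketch, suppose the whole plots are arranged in r randomized blocks (the symbol r is not used in the text above). The usual degrees of freedom for the three error terms then sum as follows:

```latex
\begin{align*}
\underbrace{(r-1)(a-1)}_{\text{whole-plot error}}
 + \underbrace{a(r-1)(b-1)}_{\text{subplot error}}
 + \underbrace{ab(r-1)(c-1)}_{\text{sub-subplot error}}
 &= (r-1)\bigl[(a-1) + (ab-a) + (abc-ab)\bigr] \\
 &= (r-1)(abc-1),
\end{align*}
```

which is the error degrees of freedom of a randomized block design with abc treatment combinations in each of r blocks.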


$2^p$ DESIGN AND FRACTIONAL REPLICATIONS 


In many experimental works involving a large number of factors, a very useful 
factorial design for preliminary exploration is a $2^p$ design. This design has 
$p$ treatment factors, each having two levels, giving a total of $2^p$ treatment 
combinations. Thus, in any replication of this design, $2^p$ experimental units are 
required. In a $2^p$ design, there are $p$ main effects, $\binom{p}{2}$ two-way interactions, $\binom{p}{3}$ 
three-way interactions, etc., and finally one $p$-way interaction. Note that all the 
main effects and each one of the interactions have only one degree of freedom. 
If the design is replicated in b blocks each containing $2^p$ experimental units, 
then there are $(2^p - 1)(b - 1)$ degrees of freedom available for the error term. If 
the number of experimental units available is limited, it may not be possible to 
replicate the design. In such a case there will be no degrees of freedom available 
for the error term. However, if the higher-order interactions can be assumed to 
be negligible, which is often the case, one can pool the sums of squares for these 
interactions in order to obtain an estimate of the error mean square. If some 
of the higher-order interactions are not zero, the F test for the main effects 
and the lower-order interactions will tend to be conservative. If there is a large 
number of treatment factors and the available resources are limited, it may be 
necessary to use a replication of only a fraction of the total number of treatment 
combinations. In a design involving a fractional replication, some of the effects 
cannot be estimated since they are confounded with one or more other effects. 
Usually, the choice of a fractional replication is made such that the effects 
considered to be of importance are confounded only with the effects that can be 
assumed to be negligible. For a complete discussion of $2^p$ and other factorial 
designs, and their fractional replications, see Kempthorne (1952), Cochran and 
Cox (1957), and Box et al. (1978). 
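
The counting of effects and the confounding produced by a fractional replication can be illustrated with a small Python sketch. The helper names are hypothetical, and the half fraction shown uses one common choice of defining relation (the product of all factor levels equal to +1, with levels coded -1 and +1); other fractions are equally possible.

```python
from itertools import product
from math import comb, prod

def effects_by_order(p):
    """Number of k-factor effects in a 2^p design, for k = 1..p."""
    return {k: comb(p, k) for k in range(1, p + 1)}

def half_fraction(p):
    """Runs of a half replicate: those whose coded levels multiply to +1."""
    return [run for run in product((-1, 1), repeat=p) if prod(run) == 1]

def contrast_column(runs, factors):
    """Contrast coefficients of the effect involving the given factor indices."""
    return [prod(run[f] for f in factors) for run in runs]

counts = effects_by_order(4)          # main effects, two-way, three-way, four-way
runs = half_fraction(3)               # half replicate of a 2^3 design
# In this fraction the main effect of the first factor and the interaction
# of the other two have identical contrast columns, so they are confounded.
alias = contrast_column(runs, (0,)) == contrast_column(runs, (1, 2))
print(counts, len(runs), alias)
```

The counts over all orders sum to 2^p - 1, one degree of freedom per effect, and the identical contrast columns show why the two effects cannot be estimated separately from the fraction.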


10.8 USE OF STATISTICAL COMPUTING PACKAGES 


Completely randomized designs (CRD) can be analyzed exactly as the one-way 
analysis of variance. The use of SAS, SPSS, and BMDP programs for this anal- 
ysis is described in Section 2.15. Similarly, randomized block designs (RBD) 
can be analyzed exactly as the two-way analysis of variance with n (n ≥ 1) 
observations per cell. The use of appropriate programs for this type of analysis 
is described in Sections 3.17 and 4.17. Latin squares (LSD) and Graeco-Latin 
squares (GLS) can be analyzed similarly to three-way and four-way crossed-
classification models without interactions. For example, with SAS, one can use 
either PROC ANOVA or PROC GLM for all of them. The important instruc- 
tions for both procedures are the CLASS and the MODEL statements. For the 
CRD, these are 


CLASS TRT; 
MODEL Y = TRT; 


and for the RBD, these are 


CLASS BLC TRT; 
MODEL Y = BLC TRT; 


where TRT, BLC, and Y designate treatment, block, and response, respectively. 
Similarly, for the LSD, we have 


CLASS ROW COL TRT; 
MODEL Y = ROW COL TRT; 


and, for the GLS, we have 


CLASS ROW COL GRG ROM; 
MODEL Y = ROW COL GRG ROM; 


where ROW, COL, GRG, and ROM designate row, column, Greek letter, and 
Roman letter factors, respectively. For the split-plot design, these statements 
are 


CLASS BLC A B; 
MODEL Y = BLC A BLC*A B BLC*B A*B; 


where BLC stands for blocks (replications) and A and B are whole-plot and 
subplot treatments, respectively. 
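
To make concrete what a statement such as MODEL Y = BLC TRT computes for a randomized block design with one observation per cell, the following Python sketch partitions the total sum of squares by hand. It is not SAS output; the function name and the small data set are hypothetical and serve only to illustrate the arithmetic.

```python
def rbd_anova(y):
    """Sums of squares for a randomized block design.

    y[i][j] is the response in block i under treatment j
    (one observation per cell, additive model without interaction).
    """
    b, t = len(y), len(y[0])
    grand = sum(map(sum, y)) / (b * t)
    block_means = [sum(row) / t for row in y]
    trt_means = [sum(y[i][j] for i in range(b)) / b for j in range(t)]

    ss_total = sum((y[i][j] - grand) ** 2 for i in range(b) for j in range(t))
    ss_block = t * sum((m - grand) ** 2 for m in block_means)
    ss_trt = b * sum((m - grand) ** 2 for m in trt_means)
    ss_error = ss_total - ss_block - ss_trt

    df_trt, df_block, df_error = t - 1, b - 1, (b - 1) * (t - 1)
    ms_error = ss_error / df_error
    return {
        "ss_total": ss_total, "ss_block": ss_block,
        "ss_trt": ss_trt, "ss_error": ss_error,
        "df_trt": df_trt, "df_block": df_block, "df_error": df_error,
        "f_trt": (ss_trt / df_trt) / ms_error,
        "f_block": (ss_block / df_block) / ms_error,
    }

# Hypothetical data: 3 blocks (rows) by 3 treatments (columns)
data = [[10, 12, 15],
        [11, 14, 16],
        [9, 13, 17]]
result = rbd_anova(data)
for name, value in result.items():
    print(name, round(value, 4))
```

The F ratios would then be compared with the appropriate percentiles of the F distribution, which a statistical package reports automatically.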




If some of the factors are to be treated as random and the researcher is 
interested in estimating variance components, one may employ appropriate 
procedures in SAS, SPSS, and BMDP for this purpose. For example, in a Latin 
square with rows and columns regarded as random factors, the following SAS 
statements may be used to analyze the design via the MIXED procedure: 


PROC MIXED; 
CLASS ROW COL TRT; 
MODEL Y = TRT; 
RANDOM ROW COL; 
RUN; 
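
For reference, the model that these statements appear to specify can be written out as follows. The symbols are assumptions introduced for this sketch (they are not taken from the text above), and the normality assumptions are those customarily made for likelihood-based fitting of mixed models.

```latex
\begin{gather*}
y_{ijk} = \mu + \rho_i + \gamma_j + \tau_k + e_{ijk}, \\
\rho_i \sim N(0, \sigma_\rho^2), \qquad
\gamma_j \sim N(0, \sigma_\gamma^2), \qquad
e_{ijk} \sim N(0, \sigma_e^2),
\end{gather*}
```

where $\tau_k$ is the fixed effect of the k-th treatment (the MODEL statement), $\rho_i$ and $\gamma_j$ are the random row and column effects (the RANDOM statement), and all random terms are taken to be mutually independent. The variance components $\sigma_\rho^2$, $\sigma_\gamma^2$, and $\sigma_e^2$ are the quantities estimated by the procedure.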


EXERCISES 


1. An experiment was designed to compare four different feeds in regard 
to the gain in weight of cattle. Twenty cattle were divided at random 
into four groups of five each and each group was placed on a different 
feed. After a certain duration of time, the weight gain in kilograms 
for each of the cattle was recorded and the data are given as follows. 


Feed A   Feed B   Feed C   Feed D


  34       64      111       96
  45       49       34       85
  35       52      122       91
  49       47       27       88
  44       58       29       94


(a) Describe the mathematical model and the assumptions for the 
experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Perform an appropriate F test for the hypothesis that the mean 
weight gains for all the feeds are the same. Use α = 0.05. 

(d) If the hypothesis in part (c) is rejected, find 95 percent simulta- 
neous confidence intervals for the contrasts between each pair 
of feeds using Tukey’s and Scheffé’s methods. 

(e) Carry out the test for homoscedasticity at α = 0.01 by employing 


(i) Bartlett’s test, 
(ii) Hartley’s test, 
(iii) Cochran’s test. 


2. Three methods of teaching were compared to determine their
comparative value on students’ learning ability. Thirty students of comparable 
ability were randomly divided into three groups of 10 each, and each 
group received instruction using a different method. After completion 
of instruction, the learning score for each student was determined and 


526 


The Analysis of Variance 


the data given as follows. 


Method A Method B Method C 


161 179 134 
131 261 176 
186 311 153 
281 176 186 
213 196 131 
155 163 131 
221 221 157 
167 232 164 
191 264 175 
216 259 133 


(a) Describe the mathematical model and the assumptions for the 
experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Perform an appropriate F test for the hypothesis that the mean 
learning scores for the three methods are the same. Use α = 0.05. 

(d) If the hypothesis in part (c) is rejected, find 95 percent
simultaneous confidence intervals for the three single contrasts between 
each pair of teaching methods using Tukey’s and Scheffé’s
procedures. Use α = 0.01. 

(e) Carry out the test for homoscedasticity at α = 0.01 by employing 


(i) Bartlett’s test, 
(ii) Hartley’s test, 
(iii) Cochran’s test. 


3. Steel and Torrie (1980, p. 144) reported data from an experiment,
conducted by F. R. Urey, Department of Zoology, University of Wisconsin, 
on estrogen assay of several solutions that had been subjected to an 
in vitro inactivation technique. Twenty-eight rats were randomly as- 
signed to six different solutions and a control group and the uterine 
weight of the rat was used as a measure of the estrogen activity. The 
uterine weight in milligrams for each rat was recorded and the data 
given as follows. 


   1       2       3       4       5       6    Control

  84.4    64.4    75.2    88.4    56.4    65.6    89.8
 116.0    79.8    62.4    90.2    83.2    79.4    93.8
  84.0    88.0    62.4    73.2    90.4    65.6    88.4
  68.6    69.4    73.8    87.8    85.6    70.2   112.6


Source: Steel and Torrie (1980, p. 144). Used with permission. 




(a) Describe the mathematical model and the assumptions for the 
experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Perform an appropriate F test for the hypothesis that the mean 
uterine weights for all the groups are the same. Use α = 0.05. 

(d) If the hypothesis in part (c) is rejected, find 95 percent confidence 
intervals for the single contrast comparing control with the mean 
of all the other six treatments and interpret your results. Use 
α = 0.05. 

(e) Carry out the test for homoscedasticity at α = 0.01 by employing 


(i) Bartlett’s test, 
(ii) Hartley’s test, 
(iii) Cochran’s test. 


4. Fisher and McDonald (1978, p. 45) reported data from an experiment 
designed to study the effect of experience on errors in the reading of 
chest x-rays. Ten radiologists participated in the study and were clas- 
sified into one of three groups: senior staff, junior staff, and residents. 
Each radiologist was asked whether the left ventricle was normal and 
the response was compared to the results of ventriculography. The 
percentage of errors for each radiologist was determined and the data 
given as follows. 


Senior Staff   Junior Staff   Residents

    7.3            13.3          14.7
    7.4            10.6          23.0
                   15.0          22.7
                   20.7          26.6


Source: Fisher and McDonald (1978, p. 46). 
Used with permission. 


(a) Describe the mathematical model and the assumptions for the 
experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Perform an appropriate F test for the hypothesis that the mean 
percentage errors for the three groups of radiologists are the 
same. Use α = 0.05. 

(d) Transform each percentage to its arcsine value and then perform 
a second analysis of variance on the transformed data, comparing 
the results with those obtained in part (c). 

5. Lorenzen and Anderson (1993, p. 46) reported data from an experiment 
designed to study the effect of honey on haemoglobin in children. A 
completely randomized design was used with 12 children, 6 given a 
tablespoon of honey added to a cup of milk, and 6 not given honey 
over a period of six straight weeks. The data are given as follows. 


Honey   Control

  19       14
  12        8
   9        4
  17        4
  24       11
  22       15


Source: Lorenzen and Anderson 
(1993, p. 46). Used with 
permission. 


(a) Describe the mathematical model and the assumptions for the 
experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Perform an appropriate F test for the hypothesis that the mean 
haemoglobin levels in two groups of children are the same. Use 
α = 0.05. 

(d) Carry out the test for homoscedasticity using Snedecor’s F 
test. Use α = 0.05. 

6. A randomized block experiment was conducted with five treatments 
and five blocks. The following table gives partial results on analysis 
of variance. 


Source of    Degrees of   Sum of    Mean
Variation    Freedom      Squares   Square   F Value   p-Value

Treatment                 150.2
Block                      49.1
Error
Total                     295.2


(a) Describe the mathematical model and the assumptions for the 
experiment. 

(b) Complete the analysis of variance table. 

(c) Perform an appropriate F test for the hypothesis that all the 
treatment means are the same. Use α = 0.05. 

7. An experiment is designed to compare mileage of four brands of gaso- 
line. Inasmuch as mileage will vary according to road and other driving 
conditions, five different categories of driving conditions are included 
in the experiment. A randomized block design is used and each brand 
of gasoline is randomly selected to fill five cars. Finally, each car is 
randomly assigned a given driving condition. The data are given as 
follows. 


Driving            Brand of Gasoline
condition     B1      B2      B3      B4

D1           48.1    39.1    51.6    49.1
D2           34.6    46.1    45.6    35.1
D3           47.6    48.6    41.6    39.6
D4           30.6    43.6    39.6    31.6
D5           39.6    45.1    44.1    21.1


(a) Describe the mathematical model and the assumptions for the 
experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Do the brands have a significant effect on the mileage? Use 
α = 0.05. 

(d) Do the driving conditions have a significant effect on mileage? 
Use α = 0.05. 

(e) If there are significant differences in mileage due to brands, use 
a suitable multiple comparison procedure to determine which 
brands differ. Use α = 0.01. 


8. An agricultural experiment was designed to study the potential for 
grain yield of four different varieties of wheat. A randomized block 
design with five blocks was used and each variety was planted in each 
of the blocks. The data on yields are given as follows. 


                  Variety
Block      I       II      III      IV

1        152.4   153.4   150.9   145.5
2        154.1   152.8   154.4   146.1
3        154.4   156.3   155.3   149.8
4        155.1   156.1   152.3   148.1
5        156.5   154.6   155.2   148.9


(a) Describe the mathematical model and the assumptions for the 
experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Do the varieties have a significant effect on the yield? Use 
α = 0.05. 

(d) Do the blocks have a significant effect on the yield? Use 
α = 0.05. 

(e) If there are significant differences in yields due to varieties, use 
a suitable multiple comparison method to determine which
varieties differ. Use α = 0.01. 



9. An experiment was designed to study the reaction time among rats 
under the influence of three different treatments. Four rats were chosen 
for the experiment and three treatments were administered on each rat 
on three different days, and the order in which each rat received a 
treatment was random. The data on reaction time in seconds are given 
as follows. 


          Treatment
Rat     A      B      C

1      6.8   11.5    8.4
2      5.6    9.7    7.3
3      2.2    7.4    3.9
4      3.7    8.3    5.7


(a) Describe the mathematical model and the assumptions for the 
experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Do the treatments have a significant effect on the reaction time? 
Use α = 0.05. 

(d) Do the rats have a significant effect on the reaction time? Use 
α = 0.05. 

(e) If there are significant differences in reaction time due to
treatments, use a suitable multiple comparison method to determine 
which treatments differ. Use α = 0.01. 

10. Anderson and Bancroft (1952, p. 245) reported data from an
experiment conducted by Middleton and Chapman at Laurinburg, North 
Carolina, to compare eight varieties of oats. The experiment involved 
a randomized block design with five blocks and the yields of grain in 
grams for a 16-foot row were recorded. The data are given as follows. 


                         Variety
Block     I     II    III    IV     V     VI    VII   VIII

1        296    402   437    303   469    345   324    488
2        357    390   334    319   405    342   339    374
3        340    431   426    310   442    358   357    401
4        331    340   320    260   487    300   352    338
5        348    320   296    242   394    308   220    320


Source: Anderson and Bancroft (1952, p. 245). Used with permission. 


(a) Describe the mathematical model and the assumptions for the 
experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Perform an appropriate F test for the hypothesis that the mean 
yields for all the varieties are the same. Use α = 0.05. 

(d) If there are significant differences in mean yields among the
varieties, use a suitable multiple comparison procedure to determine 
which varieties differ. Use α = 0.01. 

(e) What is the efficiency of this design compared with a completely 
randomized design? 

11. Fisher and McDonald (1978, p. 66) reported data from an experiment 
designed to study the effect of different heat treatments of the dietary 
protein of young rats on the sulfur-containing free amino acids in 
the plasma. The experiment involved a randomized block design with 
three blocks and six different treatments of heated soybean protein. 
The plasma-free cystine levels in rats fed on different treatments were 
recorded (μmoles/100 ml) and the data are given as follows (where 
each observation is the average for four rats). 


               Heat Treatment
Block     I     II    III    IV     V     VI

1        4.0   4.0   4.1   3.8   4.5   3.8
2        4.6   5.7   5.2   4.9   5.6   5.3
3        4.9   6.1   5.4   5.2   5.9   5.7


Source: Fisher and McDonald (1978, p. 66). Used with 
permission. 


(a) Describe the mathematical model and the assumptions for the 
experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Perform an appropriate F test for the hypothesis that the mean 
levels of plasma-free cystine in rats fed on different treatments 
are the same. Use α = 0.05. 

(d) If there are significant differences in mean levels of plasma-free 
cystine due to treatments, use a suitable multiple comparison 
method to determine which treatments differ. Use α = 0.01. 

(e) What is the efficiency of this design compared with a completely 
randomized design? 

12. John (1971, p. 64) reported data from an experiment involving a ran- 
domized block design with three blocks and 12 treatments including a 
control. The yields, in ounces, of cured tobacco leaves were recorded 
and the data are given as follows. 


                                 Treatment
Block    I   II  III   IV    V   VI  VII VIII   IX    X   XI  Control

1       76   82   76   70   76   70   82   88   81   74   67    79
2       70   70   73   74   73   83   74   65   67   67   67    78
3       80   73   77   62   86   84   80   80   81   76   79    63


Source: John (1971, p. 64). Used with permission. 


(a) Describe the mathematical model and the assumptions for the 
experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Perform an appropriate F test for the hypothesis that the mean 
yields for all the treatments are the same. Use α = 0.05. 

(d) If the hypothesis in part (c) is rejected, use Dunnett’s procedure 
to test differences between the control and each of the other 
treatment means. Use α = 0.01. 

(e) As an alternative to Dunnett’s procedure, one might wish to 
compare control versus the mean of the other 11 treatments. Set 
up the necessary contrast and test the implied null hypothesis. 
Use α = 0.01. 

13. John and Quenouille (1977) reported data from a randomized block 
experiment to test the efficacy of five levels of application of potash on 
the Pressley strength index of cotton. The levels of potash consisted of 
pounds of K₂O per unit area, expressed as units, and the experiment 
was carried out in three blocks. The data are as follows. 


                 Treatment
Block     I      II     III     IV      V

1        7.62   8.14   7.76   7.17   7.46
2        8.00   8.15   7.73   7.57   7.68
3        7.93   7.87   7.74   7.80   7.21


Source: John and Quenouille (1977). Used with 
permission. 


(a) Describe the mathematical model and the assumptions for the 
experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Do the treatments have a significant effect on the strength of 
cotton? Use α = 0.05. 

(d) Do the blocks have a significant effect on the strength of cotton? 
Use α = 0.05. 

(e) If there are significant differences in mean levels of the Pressley 
index of cotton, use a suitable multiple comparison method to 
determine which treatments differ. Use α = 0.01. 

14. Snee (1985) reported data from an experiment designed to
investigate the effect of a drug added to the feed of chicks on their growth. 
There were three treatments: standard feed (control group), standard 
feed and a low dose of drug, standard feed and a high dose of drug. 
The experimental units included a group of chicks fed and reared in 
the same bird house. Eight blocks of three experimental units each were 
laid out with physically adjacent units assigned to the same block. The 


data are given in the following where each observation is the average 
weight (lbs) per bird at maturity. 


                 Treatment
Block   Control   Low dose   High dose

1        3.93       3.99       3.96
2        3.78       3.96       3.94
3        3.88       3.96       4.02
4        3.93       4.03       4.06
5        3.84       4.10       3.94
6        3.75       4.02       4.09
7        3.98       4.06       4.17
8        3.84       3.92       4.12


Source: Snee (1985). Used with permission. 


(a) Describe the mathematical model and the assumptions for the 
experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Perform an appropriate F test for the hypothesis that the mean 
weights of chicks fed on different treatments are the same. Use 
α = 0.05. 

(d) If there are significant differences in mean weights due to
treatments, use a suitable multiple comparison method to determine 
which treatments differ. Use α = 0.01. 

(e) What is the efficiency of this design compared with a completely 
randomized design? 

15. Steel and Torrie (1980, p. 202) reported unpublished data, courtesy 
of R. A. Linthurst and E. D. Seneca, North Carolina State University, 
Raleigh, North Carolina (paper title, “Aeration, Nitrogen, and Salinity
as Determinants of Spartina alterniflora Growth Response”), who 
conducted a greenhouse experiment on the growth of Spartina
alterniflora in order to study the effects of salinity, nitrogen and aeration. The 
dried weight of all aerial plant material was recorded and the data are 
given as follows. 


                                Treatment*
Block     I      II     III     IV      V      VI     VII    VIII    IX

1        11.8   18.8   21.3   83.3    8.8   26.2   20.4   50.2    2.2
2         8.1   15.8   22.3   25.3    8.1   19.5    8.5   47.7    3.3
3        22.6   37.1   19.8   55.1    2.1   17.8    8.2   16.4   11.1
4         4.1   22.1   49.0   47.6   10.0   20.3    4.8   25.8    2.7


15.3 
10.2 


* Treatment combinations are defined as follows. 


Treatment Code
Number             I   II  III   IV    V   VI  VII VIII   IX    X   XI  XII

Salinity          15   15   15   15   30   30   30   30   45   45   45   45
parts/thousand

Nitrogen           0    0  168  168    0    0  168  168    0    0  168  168
kg/hectare

Aeration           0    1    0    1    0    1    0    1    0    1    0    1
(0 = none, 
1 = saturation) 


Source: Steel and Torrie (1980, p. 202). Used with permission. 


(a) Describe the mathematical model and the assumptions for the 
experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Perform an appropriate F test for the hypothesis that there are 
no differences in response due to treatments. Use α = 0.05. 

(d) Perform an appropriate F test for the hypothesis that there are 
no differences in response due to differences in salinity. Use 
α = 0.05. 

(e) Perform an appropriate F test for the hypothesis that there are no 
differences in response due to differences in nitrogen treatments. 
Use α = 0.05. 

(f) Perform an appropriate F test of the hypothesis that there are no 
differences in response due to differences in aeration treatments. 
Use α = 0.05. 

(g) Is the nitrogen contrast orthogonal to the aeration contrast? 

16. A researcher wants to study two treatment factors A and C using a 
factorial arrangement along with a randomized block design. Assume 
that the factors A and C have a and c levels respectively giving a total 
of ac treatment combinations. There are b blocks and the ac treatment 
combinations are randomly assigned to the ac experimental units within 
each block. The mathematical model for this design is given by 


$$
y_{ijk} = \mu + \beta_i + \alpha_j + \gamma_k + (\alpha\gamma)_{jk} + e_{ijk}
\qquad
\begin{cases}
i = 1, 2, \ldots, b \\
j = 1, 2, \ldots, a \\
k = 1, 2, \ldots, c,
\end{cases}
$$


where $y_{ijk}$ is the observed response corresponding to the i-th block, the 
j-th level of factor A, and the k-th level of factor C; $-\infty < \mu < \infty$ 
is the overall mean, $\beta_i$ is the effect of the i-th block, $\alpha_j$ is the effect 
of the j-th level of factor A, $\gamma_k$ is the effect of the k-th level of factor 
C, $(\alpha\gamma)_{jk}$ is the interaction between the j-th level of factor A and the 
k-th level of factor C, and $e_{ijk}$ is the customary error term. 

(a) State the assumptions of the model if all the effects are considered 
to be fixed. 

(b) State the assumptions of the model if all the effects are considered 
to be random. 


(c) State the assumptions of the model if the block effects are
considered to be random and A and C effects are considered to be 
fixed. 

(d) Report the analysis of variance table including expected mean 
squares under the assumptions of fixed, random, and mixed
models as stated in parts (a) through (c). 

(e) Assuming normality for the random effects in parts (a) through 
(c), develop tests of hypotheses for testing the effects
corresponding to the block, factors A and C, and the A x C interaction. 

(f) Determine the estimators of the variance components based on 
the analysis of variance procedure under the assumptions of the 
random and mixed model. 

17. A psychological experiment was designed to study the effect of five 
learning devices. In order to control for directional bias on learning, 
a Latin square design was used with five subjects using five different 
orders. The test scores are given in the following where the letter within 
parentheses represents the learning device used. 


                          Order of Test
Subject      1          2          3          4          5

1         105 (D)    195 (C)    185 (A)    135 (E)    170 (B)
2         165 (C)    185 (B)    150 (E)    190 (A)    150 (D)
3         155 (A)    150 (D)    185 (C)    155 (B)     85 (E)
4         165 (B)    195 (E)    135 (D)    110 (C)    105 (A)
5         245 (E)    240 (A)    170 (B)    175 (D)    135 (C)


(a) Describe the mathematical model and the assumptions for the 
experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Perform an appropriate F test for the hypothesis that the mean 
test scores for all the devices are the same, and state which device 
you would recommend for use. Use α = 0.05. 

(d) Use Tukey’s multiple comparison method to determine whether 
there are significant differences among the top three learning 
devices. Use α = 0.01. 

(e) Comment on the usefulness of the Latin square design in this 
case. 

(f) What is the efficiency of the design compared to the randomized 
block design? 

18. An experiment was designed to compare grain yields of five different 
varieties of corn. A 5 x 5 Latin square design was used to control for 
fertility gradients due to rows and columns. The data on yields were 
given as follows where the letter within parentheses represents the 
variety of the corn. 


Column
Row 1 2 3 4 5
1 65.9 (A) 66.0 (D) 68.0 (B) 67.4 (E) 63.4 (C)
2 68.2 (B) 67.6 (E) 68.6 (A) 68.6 (C) 66.4 (D)
3 68.6 (C) 68.3 (B) 68.0 (E) 67.2 (D) 69.3 (A)
4 64.1 (D) 62.9 (A) 66.2 (C) 67.0 (B) 66.8 (E)
5 61.2 (E) 62.7 (C) 61.6 (D) 67.8 (A) 64.9 (B)


(a) Describe the mathematical model and the assumptions for the 
experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Perform an appropriate F test for the hypothesis that the mean
yields for all the varieties are the same, and state which variety
you would recommend for planting. Use α = 0.05.

(d) Use Tukey’s multiple comparison method to determine whether
there are significant differences among the top three varieties.
Use α = 0.01.

(e) Comment on the usefulness of the Latin square design in this
case.

(f) What is the efficiency of the design compared to the randomized
block design?

19. Anderson and Bancroft (1952, p. 247) reported data from an experiment
conducted at the University of Hawaii to compare six different
legume intercycle crops for pineapples. A Latin square design was used
and the data on yields in 10-gram units were given as follows, where
the letter within parentheses represents the variety of the legume.


Column
Row 1 2 3 4 5 6
1 220 (B) 98 (F) 149 (D) 92 (A) 282 (E) 169 (C)
2 74 (A) 238 (E) 158 (B) 228 (C) 48 (F) 188 (D)
3 118 (D) 279 (C) 118 (F) 278 (E) 176 (B) 65 (A)
4 295 (E) 222 (B) 54 (A) 104 (D) 213 (C) 163 (F)
5 187 (C) 90 (D) 242 (E) 96 (F) 66 (A) 122 (B)
6 90 (F) 124 (A) 195 (C) 109 (B) 79 (D) 211 (E)


Source: Anderson and Bancroft (1952, p. 247). Used with permission. 


(a) Describe the mathematical model and the assumptions for the 
experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Perform an appropriate F test for the hypothesis that the mean
yields for all the varieties are the same. Use α = 0.05.

(d) Use Tukey’s multiple comparison method to make pairwise
comparisons among the three legumes with the highest yield. Use
α = 0.01.

(e) Comment on the usefulness of the Latin square design in this 
case. 

(f) What is the efficiency of the design compared to the randomized 
block design? 

20. Damon and Harvey (1987, p. 315) reported data from an experiment
conducted by Scott Werme of the Department of Veterinary and Animal
Sciences at the University of Massachusetts to study the effect of
the level of added okara in a ration of total digestible nutrient (TDN).
A 4 x 4 Latin square design involving four treatments (levels of added
okara), four sheep, and four periods was used. The data on the TDN
levels (in percents) in the total ration are given as follows, where the
letter within parentheses represents the treatment.


Sheep 
Period 1 2 3 4 
1 61.60 (A) 75.83 (D) 68.17 (C) 65.61 (B) 
2 62.05 (B) 58.63 (A) 70.33 (D) 67.22 (C) 
3 66.91 (C) 68.10 (B) 58.43 (A) 71.98 (D) 
4 69.87 (D) 67.25 (C) 63.32 (B) 60.30 (A) 


Source: Damon and Harvey (1987, p. 315). Used with permission. 


(a) Describe the mathematical model and the assumptions for the 
experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Perform an appropriate F test for the hypothesis that the mean
TDN levels in the total ration for all the treatments are the same.
Use α = 0.05.

(d) Use Tukey’s multiple comparison procedure to make pairwise
comparisons among the three top treatments. Use α = 0.01.

(e) Comment on the usefulness of the Latin square design in this 
case. 

(f) What is the efficiency of the design compared to the randomized 
block design? 

21. Fisher (1958, pp. 267-268) reported data on root weights for mangolds
from five different treatments found by Mercer and Hall in 25 plots.
The following table gives data in a Latin square layout where letters
(A, B, C, D, E) representing five different treatments are distributed
randomly in such a way that each appears once in each row and each
column.


Column 
Row 1 2 3 4 5 
1 376 (D) 371 (E) 355 (C) 356 (B) 335 (A) 
2 316 (B) 338 (D) 336 (E) 356 (A) 332 (C) 
3 326 (C) 326 (A) 335 (B) 343 (D) 330 (E) 
4 317 (E) 343 (B) 330 (A) 327 (C) 336 (D) 
5 321 (A) 332 (C) 317 (D) 318 (E) 306 (B) 


Source: Fisher (1958, pp. 267-268). Used with permission. 


(a) Describe the mathematical model and the assumptions for the
experiment.

(b) Analyze the data and report the analysis of variance table.

(c) Perform an appropriate F test for the hypothesis that the mean
root weights for mangolds for all the treatments are the same.
Use α = 0.05.

(d) Use Tukey’s multiple comparison procedure to make pairwise
comparisons among the three treatments with the highest mean
weight. Use α = 0.01.

(e) Comment on the usefulness of the Latin square design in this
case.

(f) What is the efficiency of the design compared to the randomized
block design?

22. Steel and Torrie (1980, p. 225) reported data from an experiment
designed to study moisture content of turnip greens. A Latin square
design involving five plants, five leaf sizes, and five treatments was
used. Treatments were times of weighing since moisture losses might
be anticipated in a 70°F laboratory as the experiment progressed. The
data on moisture content (in percent) are given as follows, where the
letter within parentheses represents a treatment.


Leaf size (1 = smallest, 5 = largest)
Plant 1 2 3 4 5
1 86.67 (E) 87.15 (D) 88.29 (A) 88.95 (C) 89.62 (B)
2 85.40 (B) 84.77 (E) 85.40 (D) 87.54 (A) 86.93 (C)
3 87.32 (C) 88.53 (B) 88.50 (E) 89.99 (D) 89.68 (A)
4 84.92 (A) 85.00 (C) 87.29 (B) 87.85 (E) 87.08 (D)
5 84.88 (D) 86.16 (A) 87.83 (C) 85.83 (B) 88.51 (E)


Source: Steel and Torrie (1980, p. 225). Used with permission. 


(a) Describe the mathematical model and the assumptions for the
experiment.

(b) Analyze the data and report the analysis of variance table.

(c) Perform an appropriate F test for the hypothesis that the mean
moisture contents for all the treatments are the same. Use α =
0.05.

(d) Use Tukey’s multiple comparison procedure to make pairwise
comparisons among the three treatments with the highest
moisture content. Use α = 0.01.

(e) Comment on the usefulness of the Latin square design in this
case.

(f) What is the efficiency of the design compared to the randomized
block design?


23. An experiment was designed to study the effect of fertilizers on yields
of wheat. A 4 x 4 Graeco-Latin square design involving four fertilizers,
four varieties of wheat, four rows, and four columns was used.
The data are given as follows, where the Roman letter within parentheses
represents the fertilizer and the Greek letter represents the wheat
variety.


Column
Row 1 2 3 4
1 135.4 (Cβ) 124.4 (Bγ) 114.6 (Dδ) 135.4 (Aα)
2 114.2 (Bα) 105.2 (Cδ) 113.6 (Aγ) 124.4 (Dβ)
3 114.0 (Aδ) 116.1 (Dα) 134.2 (Bβ) 119.6 (Cγ)
4 114.9 (Dγ) 164.4 (Aβ) 134.5 (Cα) 118.9 (Bδ)


(a) Describe the mathematical model and the assumptions for the 
experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Perform an appropriate F test for the hypothesis that the mean
yields for all the fertilizers are the same, and state which fertilizer
you would recommend for use. Use α = 0.05.

(d) Perform an appropriate test for the hypothesis that the mean
yields for all the varieties are the same, and state which variety
you would recommend for use. Use α = 0.05.

(e) Comment on the usefulness of the Graeco-Latin square design 
in this case. 


24. An experiment was designed to study the effect of diet on cholesterol.
A Graeco-Latin square design involving five diets, five time periods,
five technicians, and five laboratories was used. Subjects were fed the
diets for different time periods and cholesterol was measured. The
data are given as follows, where the Roman letter within parentheses
represents the diet and the Greek letter represents the period.


Laboratory
Technician 1 2 3 4 5
1 175.5 (Aα) 165.5 (Bβ) 168.8 (Cγ) 155.5 (Dδ) 162.2 (Eε)
2 157.7 (Bγ) 170.0 (Cδ) 167.7 (Dε) 160.0 (Eα) 170.0 (Aβ)
3 170.0 (Cε) 161.5 (Dα) 165.5 (Eβ) 174.4 (Aγ) 162.2 (Bδ)
4 154.4 (Dβ) 164.4 (Eγ) 171.1 (Aδ) 163.3 (Bε) 166.6 (Cα)
5 160.0 (Eδ) 173.3 (Aε) 166.6 (Bα) 166.6 (Cβ) 163.3 (Dγ)


(a) Describe the mathematical model and the assumptions for the
experiment.

(b) Analyze the data and report the analysis of variance table.

(c) Perform an appropriate F test for the hypothesis that the mean
cholesterol levels for all the diets are the same. Use α = 0.05.

(d) Comment on the usefulness of the Graeco-Latin square design
in this case.

25. An experiment was designed to compare the efficiencies of four
different operators using three different machines. A split-plot design
was used where the output of one machine constitutes a whole-plot,
which was then divided into four subplots for the four operators.
The experiment was repeated four times and the data are given as
follows.


Machine 1 Machine 2 Machine 3 


Operator Operator Operator 


1 2 3 4 1 2 3 4 1 2 3 4 


161.0 
160.7 
156.5 
150.5 


143.4 
136.5 
135.3 
142.2 


161.0 
165.6 
160.6 
154.5 


135.1 
140.5 
142.5 
146.0 


167.5 
160.8 
158.7 
158.5 


138.0 
142.7 
132.6 
147.7 


(a) Describe the mathematical model and the assumptions for the
experiment.

(b) Analyze the data and report the analysis of variance table.

(c) Test whether there are significant differences in output of the
three machines. Use α = 0.05.

(d) Test whether there are significant differences in output of the
four operators. Use α = 0.05.

(e) Is there a significant interaction effect between machines and
operators? Use α = 0.05.

(f) Comment on the usefulness of the split-plot design in this case.




26. Consider a split-plot design involving six treatments A1, A2, ..., A6
that are assigned at random to six whole-plots. Each whole-plot is then
divided into two subplots for testing treatments B1 and B2 involving
three replications. The data on yields are given as follows.


111) 107) 110) «61130 «102-0 127) 112 123) 125) 120-131-130 
107) 125 #117) 11506 «110 «©1119 «©1210 129) 127) 126-127-123 
118 123 109 #119 116 117) 113° 117) «©1230: 123)0« «122 ~—=«(116 


(a) Describe the mathematical model and the assumptions for the 
experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Test whether there are significant differences in yields of the
whole-plot treatments. Use α = 0.05.

(d) Test whether there are significant differences in yields of the
subplot treatments. Use α = 0.05.

(e) Is there a significant interaction effect between whole-plot and
subplot treatments? Use α = 0.05.

(f) Comment on the usefulness of the split-plot design in this case. 

27. John (1971, p. 98) reported data from an experiment described by
Yates (1937b) carried out at the Rothamsted Experimental Station,
Harpenden, England, with two factors, varieties of oats and quantity
of manure. A split-plot design was used with the variety as the
whole-plot treatment and the manure as the subplot treatment. There were six
blocks of three plots each and each plot was divided into four subplots.
One plot in each block was planted with each of the three varieties of
oats and each subplot was assigned at random to one of the four levels
of manure. The data on yields are given as follows.


                 Block 1          Block 2          Block 3
Manure Level     Variety          Variety          Variety
(Tons/Acre)     A1   A2   A3     A1   A2   A3     A1   A2   A3
No manure      111  117  105     74   64   70     61   70   96
0.01           130  114  140     89  103   89     91  108  124
0.02           157  161  118     81  132  104     97  126  121
0.03           174  141  156    122  133  117    100  149  144


                 Block 4          Block 5          Block 6
Manure Level     Variety          Variety          Variety
(Tons/Acre)     A1   A2   A3     A1   A2   A3     A1   A2   A3
No manure       62   80   63     68   60   89     53   89   97
0.01            90   82   70     64  102  129     74   82   99
0.02           100   94  109    112   89  132    118   86  119
0.03           116  126   99     86   96  124    113  104  121


Source: John (1971, p. 99). Used with permission. 


(a) Describe the mathematical model and the assumptions for the 
experiment. 

(b) Analyze the data and report the analysis of variance table. 

(c) Test whether there are significant differences in yields of the six
blocks. Use α = 0.05.

(d) Test whether there are significant differences in yields of the
three varieties. Use α = 0.05.

(e) Test whether there are significant differences in yields of the four
manures. Use α = 0.05.

(f) Is there a significant interaction effect between varieties and
manures? Use α = 0.05.

(g) Comment on the usefulness of the split-plot design in this case.


11 Analysis of Variance
Using Statistical
Computing Packages


11.0 PREVIEW 


The widespread availability of modern high-speed mainframes and microcomputers,
together with myriad accompanying software, has made it much simpler to
perform a wide range of statistical analyses. Statistical computing
packages make it possible even for a relatively inexperienced
person to use computers to perform a statistical analysis. Although there
are numerous statistical packages that can perform the analysis of variance,
we have chosen to include in this volume three statistical packages that are
most widely used by scientists and researchers throughout the world and that
have become standards in the field.¹ The packages are the Statistical Analysis
System (SAS), the Statistical Product and Service Solutions (SPSS),² and the
Biomedical Programs (BMDP). In the following we provide a brief introduction
to these packages and their use for performing an analysis of variance and
related statistical tests of significance.³,⁴,⁵


¹ A recent survey of faculty regarding their commercial software preference for advanced analysis
of variance courses found that the most frequently used packages were SAS, SPSS, and BMDP
(Tabachnick and Fidell (1991)).

² The acronym SPSS initially originated from Statistical Package for the Social Sciences. However,
the term Statistical Product and Service Solutions now applies to the SPSS acronym as used in the
company name. The package is simply known as SPSS.

³ For a discussion of computational algorithms and the construction of computer programs for the
analysis of variance as used in statistical packages for the analysis of designed experiments, see
Heiberger (1989).

⁴ For a listing of a series of analysis of variance and related programs that can be run on
microcomputer systems, see Wolach (1983).

⁵ For some further discussions of the use of SAS, SPSS, and BMDP software in performing
analysis of variance including numerical examples illustrating various designs, see Colleyer and
Enns (1987).


11.1 ANALYSIS OF VARIANCE USING SAS


SAS is an integrated system of software products for data management, report
writing and graphics, business forecasting and decision support, applications
research, and project management. The statistical analysis procedures in the
SAS system are among the finest available. They range from simple descriptive
statistics to complex multivariate techniques. One of the advantages of SAS
software is the variety of procedures that can be performed, from elementary
analysis to the most sophisticated statistical procedures, of which GLM
(General Linear Models) is the flagship. SAS software provides tremendous
flexibility and is currently available on thousands of computer facilities throughout
the world.

For a discussion of instructions for creating SAS files and running SAS pro- 
cedures, the reader is referred to the SAS manuals and other publications docu- 
menting SAS procedures. The SAS manuals, SAS Language and Procedures: 
Usage (SAS Institute, 1989, 1991), SAS Language: Reference (SAS Institute, 
1990a), SAS Procedures Guide (SAS Institute, 1990b), and SAS/STAT User’s 
Guide (SAS Institute, 1990c), together provide exhaustive coverage of the SAS 
software. The reference and procedure manuals provide an introduction to data 
handling and data management procedures including some descriptive statis- 
tics procedures, and the STAT manual covers inferential statistical procedures. 
The SAS Introductory Guide for PCs (SAS Institute, 1992) is a very useful 
publication for the beginner which provides an elementary discussion of some 
basic and commonly used data management and statistical procedures, includ- 
ing several simple ANOVA designs. Some other related publications providing 
easy-to-use instructions for running SAS procedures together with a broad cov- 
erage of statistical techniques and interpretation of SAS data analysis include 
DiIorio (1991), Friendly (1991), Freund and Littell (1991), Littell et al. (1991),
Miron (1993), Spector (1993), Aster (1994), Burch and King (1994), Hatcher
and Stepanski (1994), Herzberg (1994), Jaffe (1994), Elliott (1995), Everitt
and Derr (1996), DiIorio and Hardy (1996), Cody and Smith (1997), and
Schlotzhauer and Littell (1997). 

There are several SAS procedures for performing an analysis of variance. 
PROC ANOVA is a very useful procedure that can be used for analyzing a 
wide variety of anova designs known as balanced designs that contain an equal 
number of observations in each submost subcell. However, this procedure is 
somewhat limited in scope and cannot be used for the unbalanced anova designs 
that contain an unequal number of observations in each submost subcell. More- 
over, in a multifactorial experiment, PROC ANOVA is appropriate when all the 
effects are fixed. PROC GLM, which stands for General Linear Models, is more 
general and can accommodate both balanced and unbalanced designs. However,
the procedure is more complicated, requiring more memory and execution
time; it runs much slower and is considerably more expensive to use than PROC
ANOVA. The other two SAS procedures which are more appropriate for anova 
models involving random effects include NESTED and VARCOMP. PROC 
NESTED is specially configured for anova designs where all factors are
hierarchically nested and involve only random effects. The NESTED procedure is
computationally more efficient than GLM for nested designs. Although PROC
NESTED is written for a completely random effects model, the computations of
the sums of squares and mean squares are the same for all the models. If some of 
the factors are crossed or any factor is fixed, PROC ANOVA is more appropriate 
for balanced data involving only fixed effects and GLM for balanced or
unbalanced data involving random effects. PROC VARCOMP is especially designed
for estimating variance components and currently implements four methods of 
variance components estimation. In addition, SAS has recently introduced a 
new procedure, PROC MIXED, which fits a variety of mixed linear models 
and produces appropriate statistics to enable one to make statistical inference 
about the data. Traditional mixed linear models contain both fixed and random 
effects parameters, and PROC MIXED fits not only the traditional variance 
components models but models containing other covariance structures as well. 
PROC MIXED can be considered as a generalization of the GLM procedure in 
the sense that although PROC GLM fits standard linear models, PROC MIXED 
fits a wider class of mixed linear models. PROC GLM produces all Types I to 
IV tests of fixed effects, but PROC MIXED computes only Type I and Type III. 
Instead of producing traditional analysis of variance estimates, PROC MIXED 
computes REML and ML estimates and optionally computes MIVQUE (0) 
estimates which are similar to analysis of variance estimates. PROC MIXED 
subsumes the VARCOMP procedure except that it does not include the Type I 
method of estimating variance components. For further information and
applications of PROC MIXED to random and mixed linear models, see Littell et al.
(1996) and SAS Institute (1997). 


Remark: The output from GLM produces four different kinds of sums of squares,
labelled as Types I, II, III, and IV, which can be thought of, respectively, as “sequential,”
“each-after-all-others,” “Σ-restrictions,” and “hypotheses.” The Type I sum of squares
of an effect is calculated in a hierarchical or sequential manner by adjusting each term
only for the terms that precede it in the model. The Type II sum of squares of an effect
is calculated by adjusting each term by all other terms that do not contain the effect in
question. If the model contains only the main effects, then each effect is adjusted for
every other term in the model. The Type III sum of squares of an effect is calculated
by adjusting for any other terms that do not contain it and are orthogonal to any effects
that contain it. The Type IV sum of squares of an effect is designed for situations in
which there are empty cells; for any effect in the model that is not contained in
any other term, Type IV = Type III = Type II. For balanced designs, all types of sums
of squares are identical; they add up to the total sum of squares and form a unique
and orthogonal decomposition. For unbalanced designs under models with no interactions,
Types II, III, and IV sums of squares are the same. Finally, for unbalanced designs
with no empty cells, Types III and IV are equivalent. Furthermore, output from GLM
includes a special form of estimable functions for each of the sums of squares listed
under Types I, II, III, and IV. The importance of the estimable functions is that each
provides a basis for formulating the associated hypothesis for the corresponding sum
of squares. For further information on sums of squares from GLM, see Searle (1987,
Section 12.2).
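

As a minimal sketch of the preceding remark, the MODEL statement in GLM accepts options that select which types of sums of squares and which estimable functions are printed. The data set name UNBAL and the variable names Y, A, and B are assumptions made only for illustration.

```sas
/* Hypothetical data set UNBAL: response Y, classification factors A and B */
PROC GLM DATA=UNBAL;
   CLASS A B;
   /* SS1-SS4 request the Type I-IV sums of squares;              */
   /* E1 and E3 print the Type I and Type III estimable functions */
   MODEL Y = A B A*B / SS1 SS2 SS3 SS4 E1 E3;
RUN;
QUIT;
```

For a balanced data set the four tables of sums of squares should coincide, as stated in the remark, so the comparison is informative mainly for unbalanced data.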




The following is an example of SAS commands necessary to run the ANOVA 
procedure: 


PROC ANOVA; 
CLASS {list of factors}; 
MODEL {dependent variable(s)} = {list of effects}; 


The class statement contains the keyword CLASS followed by a list of factors or 
variables of classification. The model statement contains the keyword MODEL 
followed by the list of dependent variable(s), followed by the equality symbol 
(=) which in turn is followed by the list of factors representing main effects 
as well as interaction effects. Interaction effects between different factors are 
designated by the factor names connected by an asterisk (*). The same com- 
mands are required for executing other analysis of variance procedures as well. 
Let Y designate the dependent variable and consider a factorial experiment
involving three factors A, B, and C. The following is a typical model statement
containing all the main and two-factor interaction effects:


MODEL Y = A B C A*B A*C B*C;


To analyze a factorial experiment with the full factorial model, a simpler way to
specify a model statement is to list all the factors separated by a vertical bar.
For example, the full factorial model involving factors A, B, C, and D can be
written as


MODEL Y = A|B|C|D;
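

The bar notation can be sketched as follows for three factors. The data set name FACT is hypothetical, and the use of the @ operator to cap the order of the generated interactions is an addition not discussed in the text above.

```sas
/* Full factorial: A|B|C expands to A B A*B C A*C B*C A*B*C */
PROC GLM DATA=FACT;
   CLASS A B C;
   MODEL Y = A|B|C;
RUN;

/* @2 limits the expansion to main effects and two-factor interactions */
PROC GLM DATA=FACT;
   CLASS A B C;
   MODEL Y = A|B|C@2;
RUN;
QUIT;
```

The second MODEL statement is equivalent to the explicit list of main effects and two-factor interactions given earlier for factors A, B, and C.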


In terms of the choice of a fixed, random, or mixed effects model, the pro- 
cedures require that the user supply information about which factors are to be 
considered fixed and which random. In some cases, if appropriate specifica- 
tions for fixed and random effects are not provided, the analysis of variance 
tests provided by the procedure may differ from what the user wants them to 
be. In GLM, one can designate a factor as random by an additional statement 
containing the keyword RANDOM followed by the names of all effects, includ- 
ing interactions, which are to be treated as random. When some of the factors 
are designated as RANDOM, the GLM procedure will perform a mixed effects 
analysis including a default analysis by treating all factors as fixed. For ex- 
ample, given a mixed effects model with A fixed and B random, the following 
commands may be used to execute a GLM program 


PROC GLM;
CLASS A B;
MODEL Y = A B A*B;
RANDOM B A*B;


The program will produce expected mean squares using “alternate” mixed
model analysis, but will use the error mean square for testing the hypotheses
for both the fixed factor A and the random factor B. If the analyst wants to
use an alternate test, the hypothesis to be tested and the appropriate error
term can be designated by employing the following command:


TEST H = {effect to be tested} E = {error term};


For example, if the analyst wants to use the interaction mean square for the 
error term for testing the hypothesis for the random factor B, the following 
commands may be used: 


PROC GLM; 
CLASS A B; 
MODEL Y = A B A*B;
TEST H = B E = A*B; 


If several different hypotheses are to be tested, multiple TEST statements can be 
used. It should be noted that the TEST statement does not override the default 
option involving the test of a hypothesis against the error term which is always 
performed in addition to the hypothesis test implied by the TEST statement. 


Remarks: (i) Given the specifications of the random and fixed effects via the RANDOM 
option, the GLM will print out the coefficients of the expected mean squares. The 
researcher can then determine how to make any test using expected mean squares. When
exact tests do not exist, PROC GLM can still be used, followed by Satterthwaite’s
procedure.

(ii) The expected mean squares and tests of significance are computed based on
‘alternate’ mixed model theory without the assumption that fixed-by-random interactions
sum to zero across levels of the fixed factor. Thus, the tests based on the RANDOM option
will differ from those based on ‘standard’ mixed model theory.

(iii) The RANDOM statement provides for the following two options which can be 
placed at the end of the statement following a slash (/): 

Q: This option provides a complete listing of all quadratic forms in the fixed effects 
that appear in the expected mean square. 

TEST: This option provides that an F test be performed for each effect specified in 
the model using an appropriate error term as determined by the expected mean squares 
under the assumptions of the random or mixed model. When more than one mean square 
must be combined to determine the appropriate error term, a pseudo-F test based on 
Satterthwaite’s approximation is performed. 

(iv) For balanced designs, the ANOVA procedure can be used instead of GLM. 
However, the ANOVA performs only the fixed effects analysis and does not allow the 
option of the RANDOM statement. 
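

The two RANDOM statement options described in remark (iii) can be combined in a single run. The following is a sketch for the mixed model with A fixed and B random; the data set name EXPT is an assumption.

```sas
/* Hypothetical data set EXPT: A fixed, B random */
PROC GLM DATA=EXPT;
   CLASS A B;
   MODEL Y = A B A*B;
   /* Q lists the quadratic forms in the fixed effects;          */
   /* TEST requests F tests whose error terms are chosen from    */
   /* the expected mean squares                                  */
   RANDOM B A*B / Q TEST;
RUN;
QUIT;
```

With the TEST option there is usually no need for separate TEST statements, although the default tests against the error mean square are still printed.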


For using the NESTED procedure, no specifications for MODEL or RANDOM
statements are required since the procedure is designed for hierarchically
nested designs involving only random factors. The order of nesting of the factors
is indicated by the CLASS statement. In addition, a VAR statement is needed
to specify the dependent variable. The following commands will execute the
NESTED procedure where the factor B is nested within factor A:


PROC NESTED; 
CLASS A B; 
VAR Y; 
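

In practice the NESTED procedure expects its input to be sorted by the classification variables in the order in which they appear in the CLASS statement. A hedged sketch of a complete run, assuming a hypothetical data set EXPT, is:

```sas
/* Sort by the CLASS variables in nesting order before calling NESTED */
PROC SORT DATA=EXPT;
   BY A B;
RUN;

PROC NESTED DATA=EXPT;
   CLASS A B;      /* B is nested within A */
   VAR Y;          /* dependent variable   */
RUN;
```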


The program involving the preceding statements will perform two tests of hy- 
potheses; one for the factor A effects against the mean square for B and the 
other for the factor B effects against the error mean square. The GLM procedure 
can also be used for analyzing designs involving nested factors. The nesting 
is indicated in the MODEL statement by the use of parentheses containing the 
factor within which the factor preceding the left parenthesis is being nested.
For example, to indicate that the factor B is nested within the factor A, the 
following MODEL statement could be used: 


MODEL Y = A B(A); 


If the factor B is random, one could perform the appropriate F test by using 
the RANDOM statement in the following SAS commands: 


PROC GLM;
CLASS A B;
MODEL Y = A B(A);
RANDOM B(A) / TEST;


Alternatively, one could replace the preceding RANDOM statement by the
following TEST statement:


TEST H = A E = B(A);


Multiple nesting involving several factors, say, A, B, C, and D, where D is
nested within C, C within B, and B within A, is indicated by the following
MODEL statement:


MODEL Y = A B(A) C(B A) D(C B A);


If factors C and D are crossed rather than nested, the above MODEL statement
will be modified as:


MODEL Y = A B(A) C*D(B A);




For problems involving estimation of variance components, PROC VARCOMP
is a better choice than the other analysis of variance procedures. It produces
estimates of variance components assuming all factors are random. For
performing a mixed model analysis, the user must indicate the factors that are to be
treated as random. All the SAS commands involving the CLASS and MODEL
statements are used the same way as in other procedures. For example, the
following commands can be used to estimate the variance components associated
with the random factors A, B, and C and their interaction effects.


PROC VARCOMP;
CLASS A B C;
MODEL Y = A|B|C;


It should be pointed out that the VARCOMP does not perform F tests. However, 
it does produce sums of squares and mean squares that can be readily used to 
perform the required F tests manually. 
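

The choice of estimation method and the handling of a mixed model can be sketched as follows. The data set name EXPT is hypothetical; the METHOD= and FIXED= options shown are not discussed in the text above and are included only as an illustration of how VARCOMP is commonly invoked.

```sas
/* All-random model with the estimation method named explicitly */
PROC VARCOMP DATA=EXPT METHOD=TYPE1;
   CLASS A B C;
   MODEL Y = A|B|C;
RUN;

/* Mixed model: FIXED=1 treats the first effect listed (A) as fixed; */
/* the remaining effects are treated as random                      */
PROC VARCOMP DATA=EXPT METHOD=REML;
   CLASS A B;
   MODEL Y = A B A*B / FIXED=1;
RUN;
```

In VARCOMP the fixed effects must therefore be listed first in the MODEL statement, so that the FIXED= count identifies them.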

PROC MIXED employs the same specifications for the CLASS statement 
as GLM but differs in the specifications of the MODEL statement. The right 
side of the MODEL statement now contains only the fixed-effect factors. The 
random-effect factors do not appear in the MODEL statement; instead, they
are listed under the RANDOM statement. Thus, the MODEL and RANDOM
statements are the essential statements in the application of the PROC MIXED
procedure. Given two factors A and B with A fixed and B random, the following 
commands can be used to run the procedure: 


PROC MIXED; 
CLASS A B; 
MODEL Y = A; 
RANDOM B A*B; 


The program will perform significance tests for fixed effects (in this case for fac- 
tor A) and will compute REML estimates of the variance components including 
the error variance component. 


Remark: PROC MIXED computes variance components estimates for random effect 
factors listed in the RANDOM statement including the error variance component; and 
performs significance tests for fixed effects specified in the MODEL statement. By 
default, the significance tests are based on likelihood principle and are equivalent to the 
conventional F tests for balanced data (default option). Variance components estimates 
are computed using the restricted maximum likelihood (REML) procedure (default 
option). For balanced data sets with nonnegative estimates, the REML estimates are 
identical to the traditional anova estimates. However, for designs with unbalanced 
structure, REML estimates generally differ from the anova estimates. Furthermore, 
unlike the ANOVA and GLM procedures, PROC MIXED does not directly compute or 
print sums of squares. Instead, it shows REML estimates of variance components and
prints a separate table of tests of fixed effects that contains results of significance tests
for the fixed-effect factors specified in the MODEL statement. 
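

The defaults described in the preceding remark can be made explicit in the PROC statement. The following is a sketch only, with a hypothetical data set EXPT, A fixed, and B random:

```sas
/* METHOD=REML is the default; METHOD=ML or METHOD=MIVQUE0 may be used instead */
PROC MIXED DATA=EXPT METHOD=REML;
   CLASS A B;
   MODEL Y = A;        /* fixed effects only                     */
   RANDOM B A*B;       /* random effects and random interactions */
RUN;
```

Rerunning the same step with a different METHOD= value is a simple way to compare the resulting variance component estimates for unbalanced data.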


11.2 ANALYSIS OF VARIANCE USING SPSS 


Statistical Package for the Social Sciences was initially developed by N. H. Nie 
and his coworkers at the National Opinion Research Center at the University of 
Chicago. It was officially shortened to SPSS when the first version of SPSS-X
was released in 1983. In Release 4.0 of the new product it was simply known
as SPSS (the X was dropped). Current versions are SPSS for Windows, the
most recent of which is now Release 9.0. This package is an integrated system
of computer programs originally developed for the analysis of social sciences 
data. The package provides great flexibility in data formation, data transfor- 
mation, and manipulation of files. Some of the procedures currently available 
include descriptive analysis, simple and partial correlations, one-way and mul- 
tiway analysis of variance, linear and nonlinear regression, loglinear analysis, 
reliability and life tables, and a variety of multivariate methods. 

For a discussion of instructions for preparing SPSS files and running an SPSS
procedure, the reader is referred to the SPSS manuals and other publications 
documenting SPSS procedures. For applications involving SPSS for Windows, 
SPSS Base 7.5 for Windows User’s Guide (SPSS Inc., 1997a) provides the 
most comprehensive and complete coverage of data management, graphics, 
and basic statistical procedures. Two other manuals, SPSS Professional
Statistics 7.5 (SPSS Inc., 1997b) and SPSS Advanced Statistics (SPSS Inc.,
1997c), are also very useful for their broad coverage and documentation
of intermediate and advanced level statistical procedures and the interpretation
of SPSS data analyses. Some other related publications providing easy-to-use
instructions for running SPSS programs include Crisler (1991), Hedderson 
(1991), Frude (1993), Hedderson and Fisher (1993), Lurigio et al. (1995), and 
Coakes and Steed (1997). The monograph by Levine (1991) is a very useful 
guide to SPSS for performing analysis of variance and other related procedures. 

There are several SPSS procedures for performing analysis of variance. The
simplest procedure is ONEWAY for performing one-way analysis of variance.⁶
The other programs will also do the one-way analysis of variance, but ONEWAY
has a number of attractive features which make it a very useful procedure. In
addition to providing the standard analysis of variance table, ONEWAY will
produce the following statistics for each group: number of cases, minimum,
maximum, mean, standard deviation, standard error, and 95 percent
confidence interval for the mean. ONEWAY also provides for testing linear,
quadratic and polynomial relations across means of ordered groups as well as
user-specified a priori contrasts (coefficients for means) to test for specific linear
relations among the means (e.g., comparing means of two treatment groups
against that of a control group). Additional features include a number of
multiple comparison tests including more than a dozen methods for testing pairwise
differences in means and 10 multiple range tests for identifying subsets of means
that are not different from each other.


6 The MEANS procedure is even more basic than ONEWAY, although it does require a few extra
mouse clicks or a subcommand keyword to obtain analysis of variance results.

Three other SPSS procedures that are of common use for analyzing higher 
level designs include ANOVA, MANOVA, and GLM. The ANOVA is used for 
performing analysis of variance for a factorial design involving two or more 
factors. For a model with five or fewer factors, the default ANOVA option 
provides for a full factorial analysis including all the interaction terms up to 
order five. The user has the option to control the number of interaction terms to 
be included in the model and any interaction effects that are not computed are 
pooled into the residual or error sum of squares. However, the user control over 
interactions in ANOVA is limited to specifying a maximum order of interactions 
to include. This means that all interactions at and below that level are included. 
For example, in a design with factors A, B, and C, the models that ANOVA will
fit (assuming no empty cells) with all three factors used are: main effects only;
main effects and all three two-way interactions; and the full factorial model. The
user does not have the option of fitting a model such as A, B, C, A*B, where
some but not all of the interactions of a particular order are included.
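
The restriction described above can be illustrated with a short sketch. The following Python fragment is not SPSS syntax; it merely enumerates, on the assumption that the maximum interaction order is the only control available, the sets of effects that can be fitted for the factors A, B, and C. The function name effects_up_to_order is hypothetical and is introduced here only for illustration.

```python
from itertools import combinations


def effects_up_to_order(factors, max_order):
    """List all main effects and interactions up to the given order."""
    terms = []
    for order in range(1, max_order + 1):
        for combo in combinations(factors, order):
            terms.append("*".join(combo))
    return terms


factors = ["A", "B", "C"]

# One candidate model for each admissible maximum order (1, 2, or 3).
models = {k: effects_up_to_order(factors, k) for k in (1, 2, 3)}

for k, terms in models.items():
    print("maximum order", k, ":", ", ".join(terms))

# A model with only some of the two-way interactions is not among them.
partial_model = ["A", "B", "C", "A*B"]
print("A, B, C, A*B available:", partial_model in models.values())
```

Only three models arise, which is why a model such as A, B, C, A*B has to be fitted with a procedure that accepts an explicit list of effects.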

The MANOVA and GLM are probably the most versatile and complex of 
all the SPSS procedures and can accommodate both balanced and unbalanced 
designs including nested or nonfactorial designs, multivariate data and analyses 
involving random and mixed effects models. Both procedures are based on a
general linear model program and accept multiple DESIGN subcommands;
however, GLM honors only the last DESIGN subcommand it encounters.
The main difference between the two procedures in terms of statistical design is
that while MANOVA uses the full-rank reparametrization, GLM uses the
generalized inverse approach to accommodate a non-full rank overparametrized
model. In SPSS version 7.5, MANOVA is available only through syntax
commands, while GLM is available both in syntax and via dialog boxes.
In addition, GLM offers a variety of features unavailable in MANOVA
(see, e.g., SPSS Inc., 1997c, pp. 345-346). For example, GLM tests the
univariate homogeneity of variance assumption using the Levene test and
provides a number of different multiple comparison tests for unadjusted
one-way factor means, while these options are unavailable in MANOVA.


Remark: In Release 8.0, ANOVA is available via command syntax only; and in its 
place a new procedure UNIANOVA for performing univariate analysis of variance has 
been introduced. UNIANOVA is simply a univariate version of the GLM procedure 
restricted primarily to handling designs with one dependent variable.


In an SPSS analysis of variance procedure, all the dependent and independent
variables are specified by using the keyword denoting the name of the procedure
followed by the listing of the dependent variable first, which is separated from
the independent variables or factors using the keyword BY. The levels of a factor
are specified by the use of parentheses that give the minimum and maximum
numeric value of the levels of the given factor. For example, if the factor A has
three levels coded as 1, 2, and 3 and B has four levels coded as 1, 2, 3, and
4, and Y denotes the dependent variable, then a one-factor analysis of variance
using ONEWAY and a full factorial analysis using ANOVA and MANOVA
procedures can be performed using the following commands:


ONEWAY Y BY A(1, 3)
ANOVA Y BY A(1, 3) B(1, 4)
MANOVA Y BY A(1, 3) B(1, 4).


Analysis of variance involving more than two factors can similarly be performed 
using either ANOVA or MANOVA procedures. For example, with three factors 
A, B, and C, each with three levels, a full factorial analysis can be performed 
using the following statements: 


ANOVA Y BY A(1, 3) B(1, 3) C(1, 3)
MANOVA Y BY A(1, 3) B(1, 3) C(1, 3).


For four factors A, B, C, and D, with A having 2 levels, B 3 levels, C 4 levels, 
and D 5 levels, the statements are: 


ANOVA Y BY A(1, 2) B(1, 3) C(1, 4) D(1, 5)
MANOVA Y BY A(1, 2) B(1, 3) C(1, 4) D(1, 5).


The syntax for GLM does not require the use of code levels for the factors 
appearing in the model and is simply written as: 


GLM Y BY A B
GLM Y BY A B C
GLM Y BY A B C D,
etc.


Remark: In Release 7.5 (and 8.0), ONEWAY does not require range specifications and 
does not honor them if they are included. 


The preceding commands for ANOVA, MANOVA, and GLM procedures 
without any further qualifications will assume a full factorial model. In 
MANOVA and GLM, a factorial model could be made more explicit by an 
additional statement separated from the MANOVA/GLM statement by a slash 
(/), and consisting of the keyword DESIGN followed by the symbol “=” which 
in turn is followed by a listing of all the factors including interactions. The 
interaction A x B between the two factors A and B is indicated by the use of 
the keyword BY connecting the two factors (i.e., A BY B). In GLM, one can
either use the keyword BY or the asterisk (*) to join the factors involved in the
interaction term. For example, in the MANOVA/GLM command given
previously, the user can explicitly indicate the full factorial model by the following
commands:


MANOVA Y BY A(1, 2) B(1, 3)
/DESIGN = A, B, A BY B.
GLM Y BY A B
/DESIGN = A, B, A*B.


If the model is not a full factorial, the DESIGN statement simply lists the effects 
to be estimated and tested. For example, with three factors A, B, and C, a model
containing main effects and the interactions A x B and B x C can be specified 
by the following MANOVA commands: 


MANOVA Y BY A(1, 2) B(1, 3) C(1, 4)
/DESIGN = A, B, C, A BY B, B BY C.


ANOVA does not have an option for a design subcommand, and models other
than the full factorial are specified by using the MAXORDERS subcommand
to suppress interactions above a certain order.

MANOVA and GLM procedures also allow the use of nested models for 
nesting one effect within another. In MANOVA, nested models are indicated 
by connecting the two factors being nested by the keyword WITHIN (or just 
W). For example, for a two-factor nested design with factor B nested within 
factor A and the response units nested within factor B, the following commands 
are used for the MANOVA procedure: 


MANOVA Y BY A(1, 3) B(1, 3)
/DESIGN A, B WITHIN A. 


In GLM, one can either use the keyword WITHIN or a pair of parentheses to 
indicate the desired nesting. Thus, the nesting in the above example is indicated 
by 


GLM Y BY A B
/DESIGN A, B(A). 


Multiple nesting involving several factors is specified by the repeated use of the
keyword WITHIN. In GLM, one can also use more than one pair of parentheses
where each pair must be enclosed or nested within another pair. For example, to
specify a three-factor completely nested design where B is nested within A and
C is nested within B, the following commands are used for the MANOVA/GLM
procedures:


MANOVA Y BY A(1, 3) B(1, 4) C(1, 4)
/DESIGN A, B WITHIN A, C WITHIN B WITHIN A.
GLM Y BY A B C
/DESIGN A, B(A), C(B(A)).


Remarks: (i) The DESIGN subcommand in MANOVA cannot be simply written as 
/DESIGN B WITHIN A; otherwise it would produce nonsensical sums of squares with
unbalanced data and an incorrect term and degrees of freedom in balanced or unbalanced 
designs. MANOVA’s reparameterization method in general requires that hierarchies be 
maintained when listing effects on the DESIGN subcommand. That is, /DESIGN B 
WITHIN A without A generally makes no sense, and /DESIGN A*B without both A 
and B in the model also makes no sense. Neither term estimates what the user wants it 
to estimate unless hierarchy is maintained. 

(ii) In GLM, the subcommand /DESIGN B(A) will fit the same model as /DESIGN 
A, B, A*B, or /DESIGN A, B(A), or /DESIGN B, A(B), or /DESIGN A, A*B, or 
/DESIGN B, A*B, or /DESIGN A*B, but the interpretation of the parameter estimates 
will differ in various cases. GLM doesn’t reparametrize, so if one leaves out contained 
effects while including containing ones, some of the parameters for the containing 
effects that would normally be redundant no longer are, and the degrees of freedom and 
interpretation of the fitted effects are altered. In order to fit the standard nested model, 
the subcommand should be specified as /DESIGN A, B(A). 


For the analysis of variance models containing random factors, the MANOVA 
or GLM procedures must be used. In MANOVA, special F tests involving a 
random and mixed model analysis are performed by specifying denominator 
mean squares via the use of the keyword VS within the design statement. All the 
denominator mean squares, other than the error mean square (which is referred 
to as WITHIN), must be named by a numeric code from 1 to 10. One can then 
test against the number assigned to the error term. It is generally convenient to 
test against the error term before defining it as long as it is defined on the same 
design subcommand. For example, in a factorial analysis involving the random 
factors A and B where the main effects for the factors A and B are to be tested 
against the A x B mean square and the A x B effects are to be tested against 
the error mean square, the following commands are required: 


MANOVA Y BY A(1, 2) B(1, 3)
/DESIGN = A VS 1
B VS 1
A BY B = 1 VS WITHIN.


In the design statement given above, the first two lines specify that factors A
and B are to be tested against the error term 1. The third line specifies that the
error term 1 is defined as the A x B interaction, which in turn is to be tested
against the error mean square (defined by the keyword WITHIN). Suppose, in
the preceding example, A is fixed, B is a random factor, and A is to be tested
against the A x B interaction as error term 1 and B is to be tested against the usual
error mean square. To perform such a mixed model analysis, we would use the
following command sequence:


MANOVA Y BY A(1, 2) B(1, 3)
/DESIGN = A VS 1 
A BY B = 1 VS WITHIN 
B VS WITHIN. 


Similarly, consider a two-factor nested design where a random factor B is nested 
within a fixed factor A. Now, the following commands will execute appropriate 
tests of the effects due to A and B(A) factors: 


MANOVA Y BY A(1, 2) B(1, 3)
/DESIGN = A VS 1 
B WITHIN A = 1 VS WITHIN. 


The first line in the design statement specifies that the factor A effect is to be 
tested against the error term 1. The second line specifies the error term 1 as 
the mean square due to the B(A) effect, and further specifies that this effect is 
to be tested against the usual error mean square (designated by the keyword 
WITHIN). For a three-factor completely nested design where a random factor 
C is nested within a random factor B which in turn is nested within a random 
factor A, the following commands are required to perform appropriate tests of 
the effects due to A, B(A), and C(B): 


MANOVA Y BY A(1, 2) B(1, 3) C(1, 4)
/DESIGN = A VS 1 
B WITHIN A = 1 VS 2 
C WITHIN B WITHIN A = 2 VS WITHIN. 


In the above example, A is tested against the error term number 1 which is de- 
fined as B WITHIN A mean square; B is tested against the error term number 2 
which is defined as the C WITHIN B WITHIN A mean square; and C is tested 
against the usual error mean square. 
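
As an illustration of how the VS specifications pair each effect with its error term, the following Python sketch forms the F ratios for the three-factor nested design above. The mean squares are hypothetical numbers chosen only for illustration, and the names mean_squares and error_term are assumptions of this sketch rather than part of any SPSS output.

```python
# Hypothetical mean squares for a balanced three-stage nested design
# (illustrative values only, not taken from any data set in the text).
mean_squares = {
    "A": 120.0,
    "B(A)": 40.0,
    "C(B(A))": 10.0,
    "WITHIN": 2.5,
}

# Error term for each effect, mirroring the VS specifications:
#   A VS 1, B WITHIN A = 1 VS 2, C WITHIN B WITHIN A = 2 VS WITHIN.
error_term = {
    "A": "B(A)",
    "B(A)": "C(B(A))",
    "C(B(A))": "WITHIN",
}

# Each F ratio is the effect mean square over its designated error mean square.
f_ratios = {
    effect: mean_squares[effect] / mean_squares[error]
    for effect, error in error_term.items()
}

for effect, f_value in f_ratios.items():
    print(effect, "tested against", error_term[effect], ": F =", f_value)
```

The mapping makes explicit that each level of nesting serves as the error term for the level above it, which is appropriate only for balanced designs with all factors random.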


Remarks: (i) The default analysis without VS specification assumes that all factors 
are fixed. The use of the keyword VS within the design statement is the main device 
available for tailoring of the analysis involving random and mixed effects models. 

(ii) The assignment of error terms via VS specification is appropriate only for
balanced designs. For unbalanced designs, pseudo-F tests based on the Satterthwaite procedure
will generally be required.




In GLM, the random or mixed effects analysis of variance is performed by
a subcommand containing the keyword RANDOM followed by names of all
the factors which are to be treated as random. If a factor A is specified as a
random effect, then all the two-factor and higher order interaction effects
containing the specified effect are automatically treated as random effects. When
the RANDOM subcommand is used, the appropriate error terms for testing
hypotheses concerning all the effects in the model are determined
automatically. When more than one mean square must be combined to determine the
appropriate error term, a pseudo-F test based on the Satterthwaite procedure
is performed. Several random effects can be specified on a single RANDOM
subcommand, or one may use more than one RANDOM subcommand, in which
case the specifications accumulate. For example, to perform a two-factor random
effects analysis of variance involving factors A and B, the following statements
are required:


GLM Y BY A B
/DESIGN A, B, A*B 
/RANDOM A, B. 


In the above example, the hypothesis testing for each effect will be automatically 
carried out against the appropriate error term. Thus, the main effects A and B 
are tested against the A x B interaction which in turn is tested against the usual
error term. Suppose, in the example above, that A is fixed and B is random. 
To perform a mixed model analysis, we would use the following sequence of 
statements: 


GLM Y BY A B
/DESIGN A, B, A*B 
/RANDOM B. 


In the above example, the effects B and A x B are treated as random effects. A
and B are tested against the A x B interaction while A x B is tested against the usual
error term. In addition, GLM also allows the option of a user-specified error term
to test a hypothesis via the use of a subcommand TEST. To use this subcommand,
the user must specify both the hypothesis term and the error term separated by
the keyword VS. The hypothesis term must be a valid effect specified or implied
in the DESIGN subcommand and must precede the keyword VS. The error term
can be a numerical value or a linear combination of valid effects. A coefficient in
a linear combination can be a real number or a fraction. If a value is specified for
the error term, one must specify the number of degrees of freedom following the
keyword DF. The degrees of freedom must be a positive real number. Multiple
TEST subcommands are allowed and are executed independently. Thus, in the
two-factor mixed model example considered above, suppose an alternate mixed
model analysis is performed where both A and B main effects are to be tested
against the A x B interaction. Now, the following statements are required to
perform the appropriate test via the TEST subcommand:


GLM Y BY A B
/DESIGN A, B, A*B
/RANDOM B
/TEST A VS A*B
/TEST B VS A*B.


Further, suppose that B effect is to be pooled with A x B term and A is to be 
tested against the pooled mean square. To achieve this, the following syntax is 
required: 


GLM Y BY A B
/DESIGN A, B, A*B 
/TEST A VS B + A*B. 


In addition to ONEWAY, ANOVA, MANOVA, and GLM procedures, SPSS 
7.5 (and 8.0) incorporates a new procedure, VARCOMP, especially designed 
to estimate variance components in a random or mixed effects analysis of vari- 
ance model. It can be used through syntax commands or via dialog boxes. 
Similar to GLM, the random factors are specified by the use of a RANDOM 
subcommand. There must be at least one RANDOM subcommand with one 
random factor; several random factors can be specified on a single RANDOM 
statement or one can use multiple RANDOM subcommands which have a cu- 
mulative effect. Four methods of estimation are available in the VARCOMP 
procedure which can be specified via the use of a METHOD subcommand. 
For example, in a two-factor mixed model with A fixed and B random, the 
following syntax will estimate the variance components using the maximum 
likelihood method: 


VARCOMP Y BY A B
/DESIGN A, B, A*B 
/RANDOM B 
/METHOD = ML. 


Remarks: (i) Four methods of estimation in the VARCOMP procedure are: analysis
of variance (ANOVA), maximum likelihood (ML), restricted maximum
likelihood (REML), and minimum norm quadratic unbiased estimator (MINQUE). The
ANOVA, ML, REML, and MINQUE methods of estimation are specified by the
keywords SSTYPE(n) (n = 1 or 3), ML, REML, and MINQUE(n) (n = 0 or 1),
respectively, on the METHOD subcommand. The default method of estimation is MINQUE(1).
MINQUE(1) assigns unit weight to both the random effects and the error term while
MINQUE(0) assigns zero weight to the random effects and unit weight to the error
term. The ANOVA method uses Type I and Type III sums of squares designated by the
keywords SSTYPE(1) and SSTYPE(3) respectively; the latter being the default option
for this method.

(ii) When using ML and REML methods of estimation, the user can specify the
numerical tolerance for checking singularity, convergence criterion for checking relative
change in the objective function, and the maximum number of iterations by use of the
following keywords on the CRITERIA subcommand: (a) EPS(n): epsilon value used
in checking singularity, n > 0 and the default is 1.0E-8; (b) CONVERGE(n):
convergence value, n > 0 and the default is 1.0E-8; (c) ITERATE(n): value of the
number of iterations to be performed, n must be a positive integer and the default is 50.

(iii) The user can control the display of optional output by the following keywords on
the PRINT subcommand: (a) EMS: when using SSTYPE(n) on the METHOD
subcommand, this option prints expected mean squares of all the effects; (b) HISTORY(n):
when using ML or REML on the METHOD subcommand, this option prints a table
containing the value of the objective function and variance component estimates at every
n-th iteration. The value of n is a positive integer and the default is 1; (c) SS: when
using SSTYPE(n) on the METHOD subcommand, this option prints a table containing
sums of squares, degrees of freedom, and mean squares for each source of variation.


11.3 ANALYSIS OF VARIANCE USING BMDP⁷


BMDP Programs are successors to BMD (biomedical) computer programs
developed under the direction of W. J. Dixon at the Health Service Computing
Facility of the Medical Center of the University of California during the early
1960s. Since 1975 the BMDP series has virtually replaced the BMD package.
The BMDP series provides the user with more flexible descriptive language, 
newly available statistical procedures, powerful computing algorithms, and the 
capability of performing repeated analyses from the same data file. The BMDP 
programs are arranged in six categories: data description, contingency table, 
multivariate methods, regression, analysis of variance, and special programs. 
Some of the procedures covered in the BMDP series are: contingency tables, 
regression analysis, nonparametric methods, robust estimators, the analysis of 
repeated measures, and graphical output which includes histograms, bivariate 
plots, normal probability plots, residual plots, and factor loading plots. 

For a discussion of instructions for creating BMDP files and running BMDP
programs, the reader is referred to BMDP manuals (Dixon 1992). In addition,
a comprehensive volume, Applied Statistics: A Handbook of BMDP Analyses
by Snell (1987), provides easy instructions to enable the student to use BMDP
programs to analyze statistical data. The main BMDP program for performing
analysis of variance is BMDP 2V. It is a flexible general purpose program
for analysis of variance, and can handle both balanced and unbalanced designs.
It performs analysis of variance or covariance for a wide variety of fixed effects
models. It can accommodate any number of grouping factors including repeated
measures, which, however, must be crossed (not nested). The program can also
distinguish group factors from repeated measures factors. In addition,
there are several other programs including 7D, 3V, and 8V that can be used to
perform an analysis of variance. 7D performs one- and two-way fixed effects
analysis of variance; its output includes descriptive statistics and side-by-side
histograms for all the groups, and other diagnostics for a thorough data
screening including a separate tally of missing or out-of-range values. In
addition, it performs Welch and Brown-Forsythe tests for homogeneity of variances
and an analysis of variance based on trimmed means including confidence
intervals for each group. 3V performs an analysis of variance for a general mixed
model including balanced or unbalanced designs. It uses maximum and
restricted maximum likelihood approaches to estimate the fixed effects and the
variance components of the model. The output includes descriptive statistics for
the dependent variable and for any covariate(s). It also provides for a number
of other optional outputs including parameter estimates for specified
hypotheses and the likelihood ratio tests with degrees of freedom and p-values. The
program 8V can perform an analysis of variance for a general mixed model
having balanced data. It can handle crossed, nested, and partially nested designs
involving either fixed, random, or mixed effects models. The output from 8V
includes an analysis of variance table with columns for expected mean squares
defined in terms of variance components and the F values, together with the
overall mean, cell means, corresponding standard deviations, and estimates of
variance components.


7 BMDP software is now owned and distributed by SPSS, Inc., Chicago, Illinois.

Generally, using the simplest possible program adequate for analyzing a 
given anova design is recommended. The programs 7D and 2V are commonly 
used for the fixed effects model whereas 3V and 8V can be used for the random 
and mixed effects models. For designs involving balanced data, 8V is recom- 
mended since it is simpler to use. For designs with unbalanced data, 3V should 
be used. For random and mixed effects models, in addition to performing 
standard analysis of variance, the program 3V also provides variance compo- 
nents estimates using maximum likelihood and restricted maximum likelihood 
procedures. In addition, BMDP has a number of other programs, namely, 3D, 
1V, 4V, and 5V, which can be employed for certain special types of anova 
designs. 3D performs two-group tests (with equal or unequal group variances,
paired or independent groups, Levene’s test for the equality of group variances,
trimmed t test, and the nonparametric forms of the tests including Spearman
correlation and Wilcoxon’s signed-rank and rank-sum tests) and 1V performs
one-way analysis of covariance to test the equality of adjusted group means. 4V
performs univariate and multivariate analysis of variance and covariance for a
wide variety of models including repeated measures, split-plot, and cross-over
designs. The output includes, among other things, a summary of descriptive
statistics, multivariate statistics, and analysis of covariance. Finally, 5V
analyzes repeated measures data for a wide variety of models including those with
unequal variances, covariances with specified patterns, and missing data.




11.4 USE OF STATISTICAL PACKAGES 
FOR COMPUTING POWER 


Consider an analysis of variance F test of a fixed factor effect to be performed
against the critical value F[ν₁, ν₂; 1 − α]. Let λ be the noncentrality parameter
under the alternative hypothesis. The following SAS command can be used to
calculate the required power:


PR = 1 - PROBF(FC, NU1, NU2, LAMBDA),


where PR, FC, NU1, NU2, LAMBDA are simply SAS names chosen to denote
power, F[ν₁, ν₂; 1 − α], ν₁, ν₂, and λ, respectively. For an F test involving a
random factor, where the distribution of the test statistic under the alternative
hypothesis depends only on the (central) F distribution, the power can be
calculated using the following command,


PR = 1 - PROBF(FC/KAPA, NU1, NU2),


where KAPA is the proportionality factor determined as a function of the design 
parameters and the values of the variance components under the alternative 
hypothesis. 

The power calculations should generally be performed for different values
of ν₁, ν₂, λ, and KAPA. This would be a useful exercise for investigating the
range of power values possible for different values of the design parameters
and the size of the factor effect to be detected under the alternative. These results could
then be used to set the parameters of the given analysis of variance design so
as to provide adequate power in order to detect a factor effect of a given size.
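
For readers without access to SAS, the two power calculations can be sketched in Python using only the standard library. This is a minimal illustration under stated assumptions, not a substitute for PROBF: the noncentral F distribution is evaluated as a Poisson mixture (with mean λ/2) of regularized incomplete beta functions, the random-factor case is taken to evaluate the central F distribution at the critical value divided by the proportionality factor, and the critical value is supplied by the user from tables. The function names and the values of λ and κ below are hypothetical.

```python
import math


def _betacf(a, b, x, max_iter=500, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        even = m * (b - m) * x / ((a + 2 * m - 1.0) * (a + 2 * m))
        odd = -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1.0))
        for coef in (even, odd):
            d = 1.0 + coef * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + coef / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log1p(-x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b


def f_cdf(f, nu1, nu2, lam=0.0, max_terms=1000):
    """P(F <= f) for a (noncentral) F variable; lam = 0 is the central case."""
    if f <= 0.0:
        return 0.0
    x = nu1 * f / (nu1 * f + nu2)
    half = lam / 2.0
    weight = math.exp(-half)           # Poisson probability of j = 0
    total = 0.0
    for j in range(max_terms):
        total += weight * reg_inc_beta(nu1 / 2.0 + j, nu2 / 2.0, x)
        weight *= half / (j + 1)       # Poisson probability of j + 1
        if j > half and weight < 1e-17:
            break
    return total


def power_fixed(fc, nu1, nu2, lam):
    """Power of the F test of a fixed effect with noncentrality lam."""
    return 1.0 - f_cdf(fc, nu1, nu2, lam)


def power_random(fc, kappa, nu1, nu2):
    """Power of the F test of a random effect with proportionality factor kappa."""
    return 1.0 - f_cdf(fc / kappa, nu1, nu2)


fc = 3.8853  # tabulated F[2, 12; 0.95]
for lam in (0.0, 2.0, 4.0, 8.0):       # hypothetical noncentrality values
    print("lambda =", lam, "power =", round(power_fixed(fc, 2, 12, lam), 4))
for kappa in (1.0, 2.0, 4.0):          # hypothetical proportionality factors
    print("kappa =", kappa, "power =", round(power_random(fc, kappa, 2, 12), 4))
```

Setting λ = 0 or κ = 1 returns the significance level, which gives a quick check that the critical value and the degrees of freedom have been entered consistently.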

In SPSS MANOVA, exact and approximate power values can be computed
via the POWER subcommand. If POWER is specified by itself without
any keyword, MANOVA calculates approximate power values of all F tests at
the 0.05 significance level. The following keywords are available on the POWER
subcommand:


APPROXIMATE — This option calculates approximate power values which 
are generally accurate to three decimal places and are much more economical 
to calculate than the exact values. This is the default option if no keyword is 
used. 

EXACT — This option calculates exact power values using the noncentral 
incomplete beta distribution. 

F(a) — This option permits the specification of the alpha value at which the 
power is to be calculated. The value of alpha must be a number between 0 
and 1, exclusive. The default alpha value is 0.05. 


In SPSS GLM, the observed power for each F test at the 0.05 significance level is
displayed by default. An alpha value other than 0.05 can be specified by the
use of the keyword ALPHA(n) in the CRITERIA subcommand. The value of
n must be between 0 and 1, exclusive.


Remarks: (i) In SPSS Release 8.0, the observed power for each F test is no longer 
printed by default. Instead checkboxes in the Options dialog box are available for effect 
size estimates and observed power, and corresponding keywords ETASQ and OPOWER 
in the PRINT subcommand. 

(ii) SPSS also has inverse distribution functions and noncentral t, χ², and F
distributions which can be used for power analysis in the same manner as in SAS.


11.5 USE OF STATISTICAL PACKAGES FOR MULTIPLE 
COMPARISON PROCEDURES 


Most multiple comparison tests are more readily performed manually using the 
results on means, sums of squares, and mean squares provided by a computer 
analysis. The hand computations are quite simple and the use of a computing 
program does not necessarily save any time. However, some of the more recent 
methods are more suited to a computer analysis. For all three computing
packages considered here, the programs designed to perform t tests do not allow
the use of a custom-made error mean square (EMS). Some programs in some
packages that allow the use of a t test on a contrast also do not permit the
use of a custom-made EMS. SAS provides the user the option of specifying a
custom-made EMS whereas the error term employed in SPSS ONEWAY and 
BMDP is fixed. Most statistical packages will perform all possible pairwise 
comparisons as the default when the option to perform multiple comparisons is 
requested. More general comparisons are also available, but may require some 
restrictions in terms of which procedures may be used. 

SAS can perform pairwise comparisons using either PROC ANOVA or
PROC GLM. There are 16 different multiple comparison procedures, including
Tukey, Scheffé, LSD, Bonferroni, Newman-Keuls, and Duncan, which can be 
readily implemented using the following statement for both the procedures: 


MEANS {name of independent variable}/{SAS name of the multiple 
comparison to be used} 


For example, in order to perform Duncan’s multiple comparison on a balanced 
one-way experimental layout using PROC ANOVA, the following commands 
may be used: 


PROC ANOVA;
CLASS A;
MODEL Y = A;
MEANS A / DUNCAN;


Both ANOVA and GLM procedures allow the use of as many multiple
comparisons as the user may want simply by listing their SAS names and separating the
names with blanks. The name of the independent variable in the MEANS
statement must be the name chosen for the independent variable when the data file
is created. For testing more general contrasts than pairwise comparisons, the
sum of squares for a contrast can be obtained by a CONTRAST command. For
example, the contrast C1 with four coefficients can be tested by the statements:


CLASS A;
MODEL Y = A;
CONTRAST 'C1' A 1 1 -1 -1;


Any number of orthogonal contrasts can be tested by including more lines in 
the CONTRAST command, for example, 


CONTRAST 'C1' A 1 1 -1 -1;
CONTRAST 'C2' A 1 -1 1 -1;
CONTRAST 'C3' A 1 -1 -1 1;
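The three contrasts above are described as orthogonal. As a minimal sketch, assuming equal group sizes (so that orthogonality reduces to a zero sum of coefficient products), this property can be checked outside SAS. The helper names `is_contrast` and `are_orthogonal` are hypothetical and not part of any package discussed here.

```python
from itertools import combinations


def is_contrast(coeffs):
    """Coefficients define a contrast when they sum to zero."""
    return sum(coeffs) == 0


def are_orthogonal(c1, c2):
    """Assuming equal group sizes, two contrasts are orthogonal when
    the sum of the products of their coefficients is zero."""
    return sum(a * b for a, b in zip(c1, c2)) == 0


# Coefficients copied from the CONTRAST statements in the text.
contrasts = {
    "C1": [1, 1, -1, -1],
    "C2": [1, -1, 1, -1],
    "C3": [1, -1, -1, 1],
}

all_contrasts = all(is_contrast(c) for c in contrasts.values())
all_orthogonal = all(
    are_orthogonal(contrasts[a], contrasts[b])
    for a, b in combinations(contrasts, 2)
)
print(all_contrasts, all_orthogonal)
```

With unequal group sizes the products would need to be weighted by the reciprocals of the sample sizes, which this sketch does not do.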


Remark: In Release 6.12, the LSMEANS statements in PROC GLM and PROC MIXED 
with ADJUST options offer several different multiple comparison procedures. In addi- 
tion, PROC MULTTEST can input a set of p-values and adjust them for multiplicity. 


SPSS can perform multiple comparison tests by using the ONEWAY and 
GLM procedures. In ONEWAY, multiple comparisons are implemented by 
the use of RANGES and POSTHOC subcommands. The RANGES subcom- 
mand allows for the following seven tests specified by the respective keywords: 
Least Significant Difference (LSD), Bonferroni (BONFERRONI), Dunn-Sidak 
(SIDAK), Tukey (TUKEY), Scheffé (SCHEFFE), Newman-Keuls (SNK) and 
Duncan (DUNCAN). (LSDMOD and MODLSD are acceptable for the Bon- 
ferroni procedure.) The RANGES subcommand provides an option for several 
tests in the same run by including more lines in the RANGES subcommand, e.g., 


/RANGES = LSD 
/RANGES = TUKEY 


etc. 


Remark: The LSD test does not maintain a single overall $\alpha$-level. However, Bonferroni
and Dunn-Sidak procedures can be performed by adjusting the overall $\alpha$-level according
to the number of desired comparisons and then using the LSD for the modified $\alpha$-level.
The default $\alpha$-level for the LSD is 0.05. If some other $\alpha$-level is desired, it is specified
within parentheses immediately following the keyword LSD in the RANGES subcommand,
for example,

/RANGES = LSD (0.0167).
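The adjusted level passed to LSD can be computed by hand. The sketch below shows the usual Bonferroni and Dunn-Sidak per-comparison levels. Reading the value 0.0167 in the example as an overall level of 0.05 split over three comparisons is an assumption, and the function names are hypothetical.

```python
def bonferroni_alpha(overall_alpha, n_comparisons):
    """Per-comparison level under the Bonferroni adjustment."""
    return overall_alpha / n_comparisons


def sidak_alpha(overall_alpha, n_comparisons):
    """Per-comparison level under the Dunn-Sidak adjustment."""
    return 1.0 - (1.0 - overall_alpha) ** (1.0 / n_comparisons)


# Assumed setting: overall level 0.05 and three planned comparisons.
print(round(bonferroni_alpha(0.05, 3), 4))  # 0.0167, the value used above
print(sidak_alpha(0.05, 3))
```

The Dunn-Sidak level is always slightly larger than the Bonferroni level for the same number of comparisons, so it is marginally less conservative.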


The POSTHOC subcommand offers options for twenty different multiple com- 
parisons including those that test the pairwise differences among means and 
those that identify homogeneous subsets of means that are not different from each
other. The latter tests are commonly known as multiple range tests. Among dif- 
ferent types of multiple comparisons available via POSTHOC subcommand are: 
Bonferroni, Sidak, Tukey’s honestly significant difference, Hochberg’s GT2, 
Gabriel, Dunnett, Ryan-Einot-Gabriel-Welsch F test (R-E-G-W F), Ryan- 
Einot-Gabriel-Welsch range test (R-E-G-W Q), Tamhane’s T2, Dunnett’s T3, 
Games-Howell, Dunnett’s C, Duncan’s multiple range test, Student-Newman- 
Keuls (S-N-K), Tukey’s b, Waller-Duncan, Scheffé, and least-significant differ- 
ence. One can use as many of these tests in one run as one wants, either using 
one POSTHOC subcommand, or a stack of them. 


Remark: Tukey’s honestly significant difference test, Hochberg’s GT2, Gabriel’s test, 
and Scheffé’s test are multiple comparison tests and range tests. Other available range 
tests are Tukey’s b, S-N-K (Student-Newman-Keuls), Duncan, R-E-G-W F (Ryan- 
Einot-Gabriel-Welsch F test), R-E-G-W Q (Ryan-Einot-Gabriel-Welsch range test), 
and Waller-Duncan. Among available multiple comparison tests are Bonferroni, Tukey’s 
honestly significant difference test, Sidak, Gabriel, Hochberg, Dunnett, Scheffé, and
LSD (least significant difference). Multiple comparison tests that do not assume equal 
variances are Tamhane’s T2, Dunnett’s T3, Games-Howell, and Dunnett’s C. 


For performing more general contrasts than the pairwise comparisons be- 
tween the means, the subcommand RANGES is replaced by CONTRAST. The 
coefficients defining the contrast are specified immediately after the keyword 
CONTRAST and separated from it by an equality sign (=) in between. The 
coefficients can be separated by spaces or commas. For example, the contrast 
with four coefficients 1, 1, -1, -1 is tested by the statement


/CONTRAST = 1 1 -1 -1


or
/CONTRAST = 1, 1, -1, -1.


Similar to pairwise multiple comparison procedures, a number of contrasts 
can be selected by including more lines in the CONTRAST subcommand; for 
example, 


/CONTRAST = 1 1 -1 -1
/CONTRAST = 1 -1 1 -1
/CONTRAST = 1 -1 -1 1


etc. 




Also, one can use both RANGES and CONTRAST subcommands following 
the same ONEWAY command. However, it is necessary that all the RANGES 
subcommands be placed consecutively followed by all the CONTRAST sub- 
commands. The CONTRAST subcommand previously outlined is equivalent 
to performing a t test and is appropriate for a single a priori comparison of
interest. If the researcher is interested in many simultaneous comparisons, which
include complex contrasts as well as pairwise ones, a multiple comparison procedure
based on general contrasts needs to be employed. Although Scheffé’s method
is available in SPSS, it can be used only for pairwise comparisons. For
more complex contrasts, however, one can use the CONTRAST subcommand to
get all the quantities needed for Scheffé’s comparison and then perform the
computations manually.
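The manual step can be sketched as follows, assuming the usual form of Scheffé's criterion: a contrast is declared significant when its absolute estimate exceeds $\sqrt{(a-1)F[a-1, N-a; 1-\alpha]}$ times its standard error. The function name, the data, and the tabled value 3.24 for $F[3, 16; 0.95]$ are illustrative assumptions. The F percentile must be supplied by the reader from a table.

```python
import math


def scheffe_contrast_test(coeffs, means, sizes, mse, f_crit):
    """Scheffe test of one contrast among a group means.

    f_crit is the tabled percentile F[a-1, N-a; 1-alpha], supplied
    by the caller; it is not computed here.
    Returns (estimate, standard error, critical difference, significant).
    """
    a = len(means)
    estimate = sum(c * m for c, m in zip(coeffs, means))
    std_error = math.sqrt(mse * sum(c * c / n for c, n in zip(coeffs, sizes)))
    critical = math.sqrt((a - 1) * f_crit) * std_error
    return estimate, std_error, critical, abs(estimate) > critical


# Hypothetical data: four groups of five observations each, MSE = 4.0,
# so the error degrees of freedom are 20 - 4 = 16.
result = scheffe_contrast_test(
    coeffs=[1, 1, -1, -1],
    means=[10.0, 12.0, 15.0, 17.0],
    sizes=[5, 5, 5, 5],
    mse=4.0,
    f_crit=3.24,  # approximate table value of F[3, 16; 0.95]
)
print(result)
```

The group means, sample sizes, and error mean square would be read from the ONEWAY output rather than typed in as they are here.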


Remarks: (i) One can list fractional coefficients in defining a contrast and thereby 
indicating that averages are in fact being tested; however, they are not required. 

(ii) Comparisons of means that do not define a contrast (do not add up to zero) can 
be tested using the CONTRAST subcommand. SPSS will analyze a “noncontrast” but
will flag a warning message that a noncontrast is being tested. 


All the tests available in ONEWAY are also available in GLM (and UNI- 
ANOVA in Release 8.0) for performing multiple comparisons between the 
means of a factor for the dependent variable which are produced via the use of 
POSTHOC subcommand. The user can specify one or more effects to be tested 
which, however, must be a fixed main effect appearing or implied in the design 
subcommand. The value of type I error can be specified using the keyword AL- 
PHA on the CRITERIA subcommand. The default alpha value is 0.05 and the 
default confidence level is 0.95. GLM also allows the use of an optional error 
term which can be defined following an effect to be tested by using the keyword 
VS after the test specification. The error term can be any single effect that is not 
the intercept or a factor tested on POSTHOC subcommand. Thus, it can be an 
interaction effect, and it can be fixed or random. Furthermore, GLM allows the 
use of multiple POSTHOC subcommands which are executed independently. 
This way the user can test different effects against different error terms. The 
output for test used for pairwise comparisons includes the difference between 
each pair of compared means, the confidence interval for the difference, and 
the significance. 

GLM also allows tests for contrasts via the CONTRAST subcommand. The 
name of the factor is specified within a parenthesis following the subcommand 
CONTRAST. After enclosing the factor name within the parenthesis, one must 
enter an equal sign followed by one of the CONTRAST keywords. In addition to 
the user defined contrasts available via the keyword SPECIAL, several different 
types of contrasts including Helmert and polynomial contrasts are also avail- 
able. Although one can specify only one factor per CONTRAST subcommand, 
multiple contrast subcommands within the same design are allowed. Values 
specified after the keyword SPECIAL are stored in a matrix in row order. For 


example, if factor A has three levels, then CONTRAST (A) = SPECIAL (1 1
1 1 -1 0 0 1 -1) produces the following contrast matrix:


1  1  1
1 -1  0
0  1 -1
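The row-order storage can be illustrated with a short sketch that arranges the flat list of values into rows. The helper `special_matrix` is hypothetical, and the requirement of exactly one value per cell of a square matrix is an assumption based on the nine values given for three levels.

```python
def special_matrix(values, n_levels):
    """Arrange a flat list of SPECIAL coefficients into rows (row order)."""
    if len(values) != n_levels * n_levels:
        # Assumption: one value per cell of an n_levels x n_levels matrix.
        raise ValueError("expected n_levels * n_levels coefficients")
    return [values[r * n_levels:(r + 1) * n_levels] for r in range(n_levels)]


rows = special_matrix([1, 1, 1, 1, -1, 0, 0, 1, -1], 3)
for row in rows:
    print(row)
```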


Suppose the factor A has three levels and Y designates the dependent variable 
being analyzed. The following example illustrates the use of a polynomial 
contrast: 


GLM Y BY A 
/CONTRAST (A) = POLYNOMIAL (1, 2, 4). 


The specified contrast indicates that the three levels of A are actually in the 
proportion 1:2:4. For illustration and examples of other contrast types, see 
SPSS Advanced Statistics 7.5 (SPSS Inc., 1997c, APPENDIX A, pp. 535- 
540). 

Similar to the SPSS, BMDP performs multiple comparisons by using the one- 
way analysis of variance program. BMDP 7D contains a number of multiple 
comparison procedures including Tukey, Scheffé, LSD, Bonferroni, Newman-
Keuls, Duncan, and Dunnett. However, the reported levels of significance for
the comparisons are for one-sided tests and should be multiplied by two to obtain
the overall significance level associated with a family of two-sided tests. BMDP
7D can perform all pairwise comparisons by using the following statement: 


/COMPARISON {BMDP name of multiple comparison to be used}. 


The program provides the option of several multiple comparison procedures 
that may be selected by including more procedure names consecutively; for 
example,


/COMPARISON TUKEY
            SCHEFFE
            BONFERRONI
            NK
            DUNCAN.


11.6 USE OF STATISTICAL PACKAGES 
FOR TESTS OF HOMOSCEDASTICITY 


All three packages (SAS, SPSS, and BMDP) provide several tests of ho- 
moscedasticity or homogeneity of variances. The MEANS statement in SAS 




GLM procedure now includes various options for testing homogeneity of vari- 
ances in a one-way model and for computing Welch’s variance-weighted one- 
way analysis of variance test for differences between groups when the group 
variances are unequal. The user can choose between Bartlett, Levene, Brown-
Forsythe, and O’Brien’s tests by specifying the following options in the MEANS
statement: 


HOVTEST = BARTLETT
HOVTEST = LEVENE < (TYPE = ABS/SQUARE) >
HOVTEST = BF
HOVTEST = OBRIEN < (W = number) >


If no test is specified in the HOVTEST option, by default Levene’s test (type = 
square) is performed. Welch’s analysis of variance is requested via option 
WELCH in the MEANS statement. For a simple one-way model with depen- 
dent variable Y and single factor classification A, the following code illustrates 
the use of HOVTEST (default) and WELCH options in the MEANS statement 
for the GLM procedure: 


PROC GLM;
CLASS A;
MODEL Y = A;
MEANS A / HOVTEST WELCH;
RUN;


For further information and applications of homogeneity of variance tests in 
SAS GLM, see SAS Institute (1997, pp. 356-359). 

SPSS ONEWAY provides an option to calculate Levene statistic for homo- 
geneity of group variances. In MANOVA procedure the user can request the 
option HOMOGENEITY as a keyword in its PRINT subcommand. HOMO- 
GENEITY performs tests for the homogeneity of variance of the dependent 
variable across the cells of the design. One or more of the following specifi-
cations can be included in the parentheses after the keyword HOMOGENEITY:


BARTLETT — This option performs and displays Bartlett-Box F test. 

COCHRAN — This option performs and displays Cochran’s C test. 

ALL — This option performs and displays both Bartlett-Box F test and 
Cochran’s C test. This is the default option if HOMOGENEITY is requested 
without further specifications. 


SPSS GLM procedure also performs Levene’s test for equality of variances 
for the dependent variable across all cells formed by combinations of between 
subject factors. However, the Levene test in GLM (and UNIANOVA in Release 
8.0) uses the residuals from whatever model is fitted and will differ from the 




description given above if the model is not a full factorial, or if there are 
covariates. The test can be requested using the HOMOGENEITY subcommand. 
BMDP 7D performs Levene’s test of homogeneity of variances among the cells
when performing one- and two-way analysis of variance for fixed effects. 


11.7 USE OF STATISTICAL PACKAGES
FOR TESTS OF NORMALITY 


Tests of skewness and kurtosis can be performed using BMDP and SPSS since 
they provide standard errors of both statistics. SAS UNIVARIATE implements 
the Shapiro-Wilk W test when the sample size is 2000 or less and a modified
Kolmogorov-Smirnov (K-S) test when the sample size is greater than 2000. 
Also, SPSS EXAMINE procedure offers the K-S Lilliefors and Shapiro-Wilk 
tests, and NPAR TESTS offers a one-sample K-S test without the Lilliefors 
correction. The K-S tests in both cases are performed for sample sizes which 
are sufficiently large, while the Shapiro-Wilk test is given when sample size is 
50 or less. 


Appendices 


A STUDENT'S t DISTRIBUTION


If the random variable $X$ has a normal distribution with mean $\mu$ and variance $\sigma^2$,
we denote this by $X \sim N(\mu, \sigma^2)$. Let $X_1, X_2, \ldots, X_n$ be a random sample from
the $N(\mu, \sigma^2)$ distribution. Then, since a linear combination of independent normal
variables is itself normal, $\bar{X} = \sum_{i=1}^{n} X_i/n \sim N(\mu, \sigma^2/n)$. Applying the technique
of standardization to $\bar{X}$, we have

\[
Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1).
\]

If $\sigma$ is unknown, then it is usually replaced by $S = \sqrt{\sum_{i=1}^{n}(X_i - \bar{X})^2/(n - 1)}$
and the statistic $Z$ becomes $(\bar{X} - \mu)/(S/\sqrt{n})$. Now, we no longer have a stan-
dard normal variate but a new variable whose distribution is known as the
Student’s t distribution.
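As a small illustration of the statistic $(\bar{X} - \mu)/(S/\sqrt{n})$, the sketch below computes it for a made-up sample. The function name and the data are hypothetical. The standard-library `statistics.stdev` uses the divisor $n - 1$, matching the definition of $S$ above.

```python
import math
import statistics


def t_statistic(sample, mu):
    """Return the statistic (xbar - mu) / (S / sqrt(n)) and its degrees
    of freedom n - 1 for a hypothesized mean mu."""
    n = len(sample)
    xbar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation, divisor n - 1
    return (xbar - mu) / (s / math.sqrt(n)), n - 1


# Hypothetical sample of five observations and hypothesized mean 5.
t_value, df = t_statistic([4.0, 5.0, 6.0, 7.0, 8.0], mu=5.0)
print(t_value, df)
```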

Before 1908, this statistic was treated as an approximate $Z$ variate in large
sample experiments. William S. Gosset, a statistician at the Guinness Brewery
in Dublin, Ireland, empirically studied the distribution of the statistic
$(\bar{X} - \mu)/(S/\sqrt{n})$ for small samples, when the sampling was from a normal distribution
with mean $\mu$ and variance $\sigma^2$. Since the company did not allow the publication
of research by its scientists, he published his findings under the pseudonym
“Student.” In 1923, Ronald A. Fisher theoretically derived the distribution of
the statistic $(\bar{X} - \mu)/(S/\sqrt{n})$ and since then it has come to be known as the
Student’s t distribution. The distribution is completely determined by a single
parameter $\nu = n - 1$, known as the degrees of freedom. The distribution has
the same general form as the distribution of $Z$, in that they both are symmetric
about a mean of zero. Both distributions are bell-shaped, but the t distribution
is more variable. In large samples, the t distribution is well approximated by
the standard normal distribution. It is only for small samples that the distinction
between the two distributions becomes important.

We use the symbol $t[\nu]$ to denote a t random variable with $\nu$ degrees of
freedom. A $100(1 - \alpha)$th percentile of the $t[\nu]$ random variable is denoted by
$t[\nu, 1 - \alpha]$ and is the point on the $t[\nu]$ curve for which

\[
P\{t[\nu] \le t[\nu, 1 - \alpha]\} = 1 - \alpha.
\]

Useful tables of percentiles of the t distribution are given by Hald (1952), Owen
(1962), Fisher and Yates (1963), and Pearson and Hartley (1970). Appendix
Table III gives certain selected values of percentiles of $t[\nu]$. The mean and
variance of a $t[\nu]$ variable are:

\[
E(t[\nu]) = 0
\]
and
\[
\mathrm{Var}(t[\nu]) = \frac{\nu}{\nu - 2}, \quad \nu > 2.
\]


B CHI-SQUARE DISTRIBUTION


If $Z_1, Z_2, \ldots, Z_\nu$ are independently distributed random variables and each $Z_i$
has the $N(0, 1)$ distribution, then the random variable

\[
V = \sum_{i=1}^{\nu} Z_i^2
\]

is said to have a chi-square distribution with $\nu$ degrees of freedom. Note that if
$X_1, X_2, \ldots, X_\nu$ are independently distributed and $X_i$ has the $N(\mu_i, \sigma_i^2)$ distri-
bution, then the random variable

\[
\sum_{i=1}^{\nu} \frac{(X_i - \mu_i)^2}{\sigma_i^2}
\]

also has the chi-square distribution with $\nu$ degrees of freedom.

We use the symbol $\chi^2[\nu]$ to denote a chi-square random variable with $\nu$
degrees of freedom. The chi-square random variable has the reproductive pro-
perty; that is, the sum of two independent chi-square random variables is again a
chi-square random variable. More specifically, if $V_1$ and $V_2$ are independent random
variables, where

\[
V_1 \sim \chi^2[\nu_1]
\]
and
\[
V_2 \sim \chi^2[\nu_2],
\]
then
\[
V_1 + V_2 \sim \chi^2[\nu_1 + \nu_2].
\]

The chi-square distribution was developed by Karl Pearson in 1900. A $100(1 - \alpha)$th
percentile of the $\chi^2[\nu]$ random variable is denoted by $\chi^2[\nu, 1 - \alpha]$ and is
the point on the $\chi^2[\nu]$ curve for which

\[
P\{\chi^2[\nu] \le \chi^2[\nu, 1 - \alpha]\} = 1 - \alpha.
\]

Useful tables of percentiles of chi-square distributions are given by Hald and
Sinkbaek (1950), Vanderbeck and Cook (1961), Harter (1964a,b), and Pearson
and Hartley (1970). Appendix Table IV gives certain selected values of per-
centiles of $\chi^2[\nu]$. The mean and variance of a $\chi^2[\nu]$ variable are:

\[
E(\chi^2[\nu]) = \nu
\]
and
\[
\mathrm{Var}(\chi^2[\nu]) = 2\nu.
\]
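The definition of a chi-square variable as a sum of squared standard normal variates, together with its mean $\nu$ and variance $2\nu$, can be checked informally by simulation. The sketch below is only an illustration: the function name, the seed, the choice $\nu = 5$, and the number of draws are arbitrary assumptions.

```python
import random
import statistics


def simulate_chi_square(df, n_draws, seed=0):
    """Draw chi-square variates as sums of df squared N(0, 1) variates."""
    rng = random.Random(seed)
    return [
        sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
        for _ in range(n_draws)
    ]


draws = simulate_chi_square(df=5, n_draws=20000)
# The sample mean should be near 5 and the sample variance near 10.
print(statistics.mean(draws), statistics.variance(draws))
```

The simulated values agree with the formulas only up to sampling error, so they should not be read as exact.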


C SAMPLING DISTRIBUTION OF $(n - 1)S^2/\sigma^2$


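Let $X_1, X_2, \ldots, X_n$ be a random sample from the $N(\mu, \sigma^2)$ distribution. Define
the sample mean $\bar{X}$ and variance $S^2$ as

\[
\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}
\]
and
\[
S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}.
\]

The sampling distribution of the statistic $(n - 1)S^2/\sigma^2$ is of particular interest
in many statistical problems and we consider its distribution here.

By the addition and subtraction of the sample mean $\bar{X}$, it is easy to see
that

\[
\begin{aligned}
\sum_{i=1}^{n} (X_i - \mu)^2 &= \sum_{i=1}^{n} [(X_i - \bar{X}) + (\bar{X} - \mu)]^2 \\
&= \sum_{i=1}^{n} (X_i - \bar{X})^2 + \sum_{i=1}^{n} (\bar{X} - \mu)^2 + 2(\bar{X} - \mu) \sum_{i=1}^{n} (X_i - \bar{X}) \\
&= \sum_{i=1}^{n} (X_i - \bar{X})^2 + n(\bar{X} - \mu)^2,
\end{aligned}
\]

where the cross-product term vanishes because $\sum_{i=1}^{n} (X_i - \bar{X}) = 0$.
Dividing each term on both sides of the equality by $\sigma^2$ and substituting $(n - 1)S^2$
for $\sum_{i=1}^{n} (X_i - \bar{X})^2$, we obtain

\[
\frac{1}{\sigma^2} \sum_{i=1}^{n} (X_i - \mu)^2 = \frac{(n - 1)S^2}{\sigma^2} + \frac{(\bar{X} - \mu)^2}{\sigma^2/n}.
\]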


Now, from Appendix B, it follows that $\sum_{i=1}^{n} (X_i - \mu)^2/\sigma^2$ is a chi-square
random variable with $n$ degrees of freedom. The second term on the right-
hand side of the equality is the square of a standard normal variate, since
$\bar{X}$ is a normal random variable with mean $\mu$ and variance $\sigma^2/n$. Therefore,
the quantity $(\bar{X} - \mu)^2/(\sigma^2/n)$ has a chi-square distribution with one degree of
freedom. Using advanced statistical techniques, one can also show that the two
chi-square variables, $(n - 1)S^2/\sigma^2$ and $(\bar{X} - \mu)^2/(\sigma^2/n)$, are independent (see,
e.g., Hogg and Craig (1995, pp. 214-217)). Thus, from the reproductive prop-
erty of the chi-square variable, it follows that $(n - 1)S^2/\sigma^2$ has a chi-square
distribution with $n - 1$ degrees of freedom.
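The algebraic identity underlying this argument, $\sum (X_i - \mu)^2 = (n - 1)S^2 + n(\bar{X} - \mu)^2$, holds for any set of numbers and can be verified numerically. The following sketch uses a made-up sample and a made-up value of $\mu$. The function name is hypothetical.

```python
import statistics


def decomposition(sample, mu):
    """Return the total sum of squares about mu, the sum of squares about
    the sample mean, and the term n * (xbar - mu)^2."""
    n = len(sample)
    xbar = statistics.mean(sample)
    total = sum((x - mu) ** 2 for x in sample)
    within = sum((x - xbar) ** 2 for x in sample)
    mean_part = n * (xbar - mu) ** 2
    return total, within, mean_part


sample = [2.0, 4.0, 4.0, 5.0, 7.0]
total, within, mean_part = decomposition(sample, mu=3.0)
print(total, within + mean_part)
```

The check confirms only the algebra. The distributional conclusion additionally needs normality and the independence of the two terms.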


D F DISTRIBUTION


If $V_1$ and $V_2$ are two independent random variables, where $V_1 \sim \chi^2[\nu_1]$ and
$V_2 \sim \chi^2[\nu_2]$, then the random variable

\[
\frac{V_1/\nu_1}{V_2/\nu_2}
\]

is said to have an F distribution with $\nu_1$ and $\nu_2$ degrees of freedom, respectively.
We use the symbol $F[\nu_1, \nu_2]$ to denote an F variable with $\nu_1$ and $\nu_2$ degrees of
freedom.

Suppose a random sample of size $n_1$ is drawn from the $N(\mu_1, \sigma_1^2)$ distribution and an
independent random sample of size $n_2$ is drawn from the $N(\mu_2, \sigma_2^2)$ distribution.
If $S_1^2$ and $S_2^2$ are the corresponding sample variances, then, from Appendix C,
it follows that

\[
\frac{(n_1 - 1)S_1^2}{\sigma_1^2} \sim \chi^2[n_1 - 1]
\]
and
\[
\frac{(n_2 - 1)S_2^2}{\sigma_2^2} \sim \chi^2[n_2 - 1].
\]

Therefore, the quotient

\[
\frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} = \frac{\chi^2[n_1 - 1]/(n_1 - 1)}{\chi^2[n_2 - 1]/(n_2 - 1)} \sim F[n_1 - 1, n_2 - 1].
\]


Fisher (1924) considered the theoretical distribution of $\frac{1}{2}\log_e(S_1^2/S_2^2)$ known
as the Z distribution. The distribution of the variance ratio F was derived by
Snedecor (1934), who showed that the distribution of F was simply a trans-
formation of Fisher’s Z distribution. Snedecor named the distribution of the
variance ratio F in honor of Fisher. The distribution has subsequently come to
be known as Snedecor’s F distribution.


Note that the number of degrees of freedom associated with the chi-square
random variable appearing in the numerator of F is always stated first, followed
by the number of degrees of freedom associated with the chi-square random
variable appearing in the denominator. Thus, the curve of the F distribution
depends not only on the two parameters $\nu_1$ and $\nu_2$, but also on the order in
which they occur.

A $100(1 - \alpha)$th percentile value of the $F[\nu_1, \nu_2]$ variable is denoted by
$F[\nu_1, \nu_2; 1 - \alpha]$ and is that point on the $F[\nu_1, \nu_2]$ curve for which

\[
P\{F[\nu_1, \nu_2] \le F[\nu_1, \nu_2; 1 - \alpha]\} = 1 - \alpha.
\]

Useful tables of percentiles of the F distribution are given by Hald (1952),
Fisher and Yates (1963), and Pearson and Hartley (1970). Mardia and Zemroch
(1978) compiled tables of F distributions that include fractional values of $\nu_1$
and $\nu_2$. The fractional values of the degrees of freedom are useful when an
F distribution is used as an approximation. Appendix Table V gives certain
selected values of percentiles of $F[\nu_1, \nu_2]$ for various sets of values of $\nu_1$ and
$\nu_2$. The mean and variance of an $F[\nu_1, \nu_2]$ variable are:

\[
E(F[\nu_1, \nu_2]) = \frac{\nu_2}{\nu_2 - 2}, \quad \nu_2 > 2
\]
and
\[
\mathrm{Var}(F[\nu_1, \nu_2]) = \frac{2\nu_2^2(\nu_1 + \nu_2 - 2)}{\nu_1(\nu_2 - 2)^2(\nu_2 - 4)}, \quad \nu_2 > 4.
\]
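The moment formulas for the F distribution are easy to evaluate exactly for integer degrees of freedom. The sketch below uses Python's `fractions` module for that purpose. The function names are hypothetical, and the restriction to integer degrees of freedom is a simplification made for this illustration.

```python
from fractions import Fraction


def f_mean(v2):
    """Mean of an F[v1, v2] variable; it exists only for v2 > 2."""
    if v2 <= 2:
        raise ValueError("the mean requires v2 > 2")
    return Fraction(v2, v2 - 2)


def f_variance(v1, v2):
    """Variance of an F[v1, v2] variable; it exists only for v2 > 4."""
    if v2 <= 4:
        raise ValueError("the variance requires v2 > 4")
    return Fraction(2 * v2 ** 2 * (v1 + v2 - 2), v1 * (v2 - 2) ** 2 * (v2 - 4))


# Illustrative choice: 4 and 10 degrees of freedom.
print(f_mean(10), f_variance(4, 10))
```

Note that the mean depends only on the denominator degrees of freedom, whereas the variance depends on both.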


E NONCENTRAL CHI-SQUARE DISTRIBUTION 


If $X_1, X_2, \ldots, X_\nu$ are independently distributed random variables and each $X_i$
has the $N(\mu_i, 1)$ distribution, then the random variable

\[
V = \sum_{i=1}^{\nu} X_i^2
\]

is said to have a noncentral chi-square distribution with $\nu$ degrees of freedom.
The quantity A = ()07_, yu2)2 is known as the noncentrality parameter of the 
distribution. We use the symbol $\chi^{2\prime}[\nu, \lambda]$ to denote a noncentral chi-square
random variable with $\nu$ degrees of freedom and the noncentrality parameter $\lambda$.

Note that the ordinary or central chi-square distribution is the special case
of the noncentral distribution when the noncentrality parameter $\lambda = 0$. In
the statistical literature the noncentrality parameter is sometimes defined dif-
ferently. Some authors use $\lambda = \sum_{i=1}^{\nu} \mu_i^2$ whereas others use $\lambda = \frac{1}{2}\sum_{i=1}^{\nu} \mu_i^2$,
both using the same symbol $\lambda$. The noncentral chi-square variable, like the
central chi-square, possesses the reproductive property; that is, the sum of
two independent noncentral chi-square variables is again a noncentral chi-square
variable. More specifically, if $V_1$ and $V_2$ are independent random variables, such
that

\[
V_1 \sim \chi^{2\prime}[\nu_1, \lambda_1]
\]
and
\[
V_2 \sim \chi^{2\prime}[\nu_2, \lambda_2],
\]
then
\[
V_1 + V_2 \sim \chi^{2\prime}[\nu_1 + \nu_2, \lambda_1 + \lambda_2].
\]

The noncentral chi-square distribution can be approximated in terms of a
central chi-square distribution. A detailed summary of various approximations
including a great deal of other information can be found in Tiku (1985a) and
Johnson et al. (1995, Chapter 29). Tables of the noncentral chi-square distribu-
tion have been prepared by Hayman et al. (1973). The mean and variance of a
$\chi^{2\prime}[\nu, \lambda]$ variable are:

\[
E(\chi^{2\prime}[\nu, \lambda]) = \nu + \lambda
\]
and
\[
\mathrm{Var}(\chi^{2\prime}[\nu, \lambda]) = 2(\nu + 2\lambda).
\]


F NONCENTRAL AND DOUBLY NONCENTRAL 
t DISTRIBUTIONS 


If $U$ and $V$ are two independent random variables, where $U \sim N(\delta, 1)$ and
$V \sim \chi^2[\nu]$, then the random variable

\[
\frac{U}{\sqrt{V/\nu}}
\]

is said to have a noncentral t distribution with $\nu$ degrees of freedom and the
noncentrality parameter $\delta$. We use the symbol $t'[\nu, \delta]$ to denote a noncentral
t distribution with $\nu$ degrees of freedom and the noncentrality parameter $\delta$.
The distribution is useful in evaluating the power of the t test. There are many
approximations of the noncentral t distribution in terms of normal and (cen-
tral) t distributions. A detailed summary of various approximations and other
results can be found in Johnson et al. (1995, Chapter 31). A great deal of
other information about the noncentral t distribution is given in Owen (1968,
1985). Tables of the noncentral t distribution have been prepared by Resnikoff
and Lieberman (1957) and Bagui (1993). The mean and variance of a $t'[\nu, \delta]$
variable are:

\[
E(t'[\nu, \delta]) = \delta \left(\frac{\nu}{2}\right)^{1/2} \frac{\Gamma[(\nu - 1)/2]}{\Gamma(\nu/2)}, \quad \nu > 1
\]
and
\[
\mathrm{Var}(t'[\nu, \delta]) = \frac{\nu}{\nu - 2}(1 + \delta^2) - \{E(t'[\nu, \delta])\}^2, \quad \nu > 2.
\]


If $V$ instead has a noncentral chi-square distribution with $\nu$ degrees of freedom
and the noncentrality parameter $\lambda$, then

\[
\frac{U}{\sqrt{V/\nu}}
\]

has a doubly noncentral t distribution with noncentrality parameters $\delta$ and $\lambda$,
respectively. Tables of the doubly noncentral t distribution are given by Bulgren
(1974). Further information including analytic expressions for the distribution
function and some computational aspects can be found in Johnson et al. (1995,
pp. 533-537) and references cited therein.


G NONCENTRAL F DISTRIBUTION 


If $V_1$ and $V_2$ are two independent random variables, where $V_1 \sim \chi^{2\prime}[\nu_1, \lambda]$ and
$V_2 \sim \chi^2[\nu_2]$, then the random variable

\[
\frac{V_1/\nu_1}{V_2/\nu_2}
\]

is said to have a noncentral F distribution with $\nu_1$ and $\nu_2$ degrees of freedom
and the noncentrality parameter $\lambda$. We use the symbol $F'[\nu_1, \nu_2; \lambda]$ to denote a
noncentral F variable with $\nu_1$ and $\nu_2$ degrees of freedom and the noncentrality
parameter $\lambda$.

It is sometimes useful to approximate a noncentral F distribution in terms
of a (central) F distribution. A detailed summary of various approximations
including a great deal of other information can be found in Tiku (1985b) and
Johnson et al. (1995, Chapter 30). Comprehensive tables of the noncentral
F distribution were prepared by Tiku (1967, 1972). Tables and charts of the
noncentral F distribution are discussed in Section 2.17. The mean and variance
of an $F'[\nu_1, \nu_2; \lambda]$ variable are:

\[
E(F'[\nu_1, \nu_2; \lambda]) = \frac{\nu_2(\nu_1 + \lambda)}{\nu_1(\nu_2 - 2)}, \quad \nu_2 > 2
\]
and
\[
\mathrm{Var}(F'[\nu_1, \nu_2; \lambda]) = \frac{2\nu_2^2[(\nu_1 + \lambda)^2 + (\nu_1 + 2\lambda)(\nu_2 - 2)]}{\nu_1^2(\nu_2 - 2)^2(\nu_2 - 4)}, \quad \nu_2 > 4.
\]


H DOUBLY NONCENTRAL F DISTRIBUTION


If $V_1$ and $V_2$ are two independent random variables, where $V_1 \sim \chi^{2\prime}[\nu_1, \lambda_1]$ and
$V_2 \sim \chi^{2\prime}[\nu_2, \lambda_2]$, then the random variable

\[
\frac{V_1/\nu_1}{V_2/\nu_2}
\]

is said to have a doubly noncentral F distribution. We use the symbol
$F''[\nu_1, \nu_2; \lambda_1, \lambda_2]$ to denote a doubly noncentral F variable with $\nu_1$ and $\nu_2$ degrees of
freedom and noncentrality parameters $\lambda_1$ and $\lambda_2$. Thus, the doubly noncentral
F distribution is the ratio of two independent variables, each distributed as a
noncentral chi-square, divided by their respective degrees of freedom. Tables
of the doubly noncentral F distribution were given by Tiku (1974). Further
discussions and details about the distribution including applications in contexts
other than analysis of variance can be found in Tiku (1974) and Johnson et al.
(1995, pp. 499-502).

The doubly noncentral F distribution is related to the doubly noncentral beta
in the same way as the (central) beta to the (central) F. The doubly noncentral
beta is the distribution of $V_1/(V_1 + V_2)$.


I STUDENTIZED RANGE DISTRIBUTION


Suppose that $(X_1, X_2, \ldots, X_p)$ is a random sample from a normal distribution
with mean $\mu$ and variance $\sigma^2$. Suppose further that $S^2$ is an unbiased estimate
of $\sigma^2$ based upon $\nu$ degrees of freedom. Then the ratio

\[
q[p, \nu] = \{\max(X_i) - \min(X_i)\}/S
\]

is called the Studentized range, where the arguments in the square bracket
indicate that the distribution of $q$ depends on $p$ and $\nu$. In general, the Studentized
range arises as the ratio of the range of a sample of size $p$ from a standard normal
population to the square root of an independent $\chi^2[\nu]/\nu$ variable with $\nu$ degrees
of freedom. In analysis of variance applications, normal samples are usually
means of independent samples of the same size, and the denominator is an
independent estimate of their common standard error. The sampling distribution
of $q[p, \nu]$ has been tabulated by various workers and good tables are available
in Harter (1960), Owen (1962), Pearson and Hartley (1970), and Miller (1981).
Perhaps the most comprehensive table of percentiles of the Studentized range
distribution is Table B2 given in Harter (1969a), which has $p$ = 2 (1) 20 (2) 40
(10) 160; $\nu$ = 1 (1) 20, 24, 30, 40, 60, 120, $\infty$; and upper-tail $\alpha$ = 0.001, 0.005,
0.01, 0.025, 0.05, 0.1 (0.1) 0.9, 0.95, 0.975, 0.99, 0.995, 0.999. Some selected
percentiles of $q[p, \nu]$ are given in Appendix Table X. The table is fairly easy
to use. Let $q[p, \nu; 1 - \alpha]$ denote the $100(1 - \alpha)$th percentile of the $q[p, \nu]$
variable. Suppose $p = 10$ and $\nu = 20$. The 90th percentile of the Studentized
range distribution is then given by

\[
q[10, 20; 0.90] = 4.51.
\]

Thus, with 10 observations from a normal population, the probability
is 0.90 that their range is not more than 4.51 times as great as an independent
sample standard deviation based on 20 degrees of freedom.


J STUDENTIZED MAXIMUM MODULUS DISTRIBUTION


The Studentized maximum modulus is the maximum absolute value of a set
of independent unit normal variates which is then Studentized by the standard
deviation. Thus, let $X_1, X_2, \ldots, X_p$ be a random sample from the $N(\mu, \sigma^2)$
distribution. Then the Studentized maximum modulus statistic is defined by

\[
m[p, \nu] = \frac{\max |X_i - \bar{X}|}{S},
\]

where $\bar{X} = \sum_{i=1}^{p} X_i/p$ and $S^2 = \sum_{i=1}^{p} (X_i - \bar{X})^2/(p - 1)$. For the case where $S^2$
represents an independent estimate of $\sigma^2$ such that $\nu S^2/\sigma^2$ has a chi-square dis-
tribution, the distribution was first derived and tabulated by Nair (1948). The
critical points for the Studentized maximum modulus distribution can also be
obtained by taking the square roots of the entries in the tables of the Studentized
largest chi-square distribution with one degree of freedom for the numerator as
given by Armitage and Krishnaiah (1964). Pillai and Ramachandran (1954) gave
a table for $\alpha$ = 0.05; $p$ = 1(1)8; and $\nu$ = 5(5)20, 24, 30, 40, 60, 120, $\infty$. Dunn
and Massey (1965) provided a table for $\alpha$ = 0.01, 0.025, 0.05, 0.10(0.1)0.50;
$p$ = 2, 6, 10, 20; and $\nu$ = 4, 10, 30, $\infty$. Hahn and Hendrickson (1971) give gen-
eral tables of percentiles of $m[p, \nu]$ for $\alpha$ = 0.01, 0.05, 0.10; $p$ = 1(1)6(2)12,
15, 20; and $\nu$ = 3(1)12, 15, 20, 25, 30, 40, 60, which also appear in Miller
(1981, pp. 277-278). Stoline and Ury (1979) and Ury et al. (1980) give spe-
cial tables with $p = a(a - 1)/2$ for $a$ = 2(1)20 and $\nu$ = 2(2)50(5)80(10)100,
respectively. Bechhoefer and Dunnett (1982) have provided tables of the dis-
tribution of $m[p, \nu]$ for $p$ = 2(1)32; $\nu$ = 2(1)12(2)20, 24, 30, 40, 60, $\infty$; and
$\alpha$ = 0.10, 0.05, and 0.01. These tables are abridged versions of more extensive
tables given by Bechhoefer and Dunnett (1981). Hochberg and Tamhane (1987)
also give tables with $p = a(a - 1)/2$ for $a$ = 2(1)16(2)20 and $\nu$ = 2(1)30(5)
50(10)60(20)120, 200, $\infty$. Some selected percentiles of $m[p, \nu]$ are given in
Appendix Table XV.


K SATTERTHWAITE PROCEDURE AND ITS APPLICATION 
TO ANALYSIS OF VARIANCE 


Many analysis of variance applications involve a linear combination of mean
squares. Let $S_i^2$ ($i = 1, 2, \ldots, p$) be $p$ mean squares such that $\nu_i S_i^2/\sigma_i^2$ has a chi-
square distribution with $\nu_i$ degrees of freedom. Consider the linear combination
$S^2 = \sum_{i=1}^{p} \ell_i S_i^2$ where the $\ell_i$'s are known constants. The Satterthwaite (1946) proce-
dure states that $\nu S^2/\sigma^2$ is distributed approximately as a chi-square distribution
with $\nu$ degrees of freedom where $\sigma^2 = E(S^2)$ and $\nu$ is determined by

\[
\nu = \frac{\left(\sum_{i=1}^{p} \ell_i S_i^2\right)^2}{\sum_{i=1}^{p} \ell_i^2 S_i^4/\nu_i}.
\]

The Satterthwaite procedure is frequently employed for constructing confidence
intervals for the mean and the variance components in a random and mixed
effects analysis of variance. For example, if a variance component $\sigma^2$ is esti-
mated by $S^2 = \sum_{i=1}^{p} \ell_i S_i^2$, then an approximate $100(1 - \alpha)\%$ confidence inter-
val for $\sigma^2$ is given by

\[
\frac{\nu S^2}{\chi^2[\nu, 1 - \alpha/2]} \le \sigma^2 \le \frac{\nu S^2}{\chi^2[\nu, \alpha/2]},
\]

where $\chi^2[\nu, \alpha/2]$ and $\chi^2[\nu, 1 - \alpha/2]$ are the $100(\alpha/2)$th lower and upper per-
centiles of the chi-square distribution with $\nu$ degrees of freedom and $\nu$ is deter-
mined by the formula given previously.
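A minimal sketch of these computations is given below. It assumes the degrees-of-freedom formula $\nu = (\sum \ell_i S_i^2)^2 / \sum (\ell_i^2 S_i^4/\nu_i)$, which has the same form as the expressions for $\nu'$ and $\nu''$ used for the pseudo-F test. The function names and the mean squares are hypothetical. The chi-square percentiles are placeholders taken from a table for 9 degrees of freedom, the integer nearest to the computed $\nu$.

```python
def satterthwaite_df(coeffs, mean_squares, dfs):
    """Approximate degrees of freedom of a linear combination of mean squares."""
    numerator = sum(l * ms for l, ms in zip(coeffs, mean_squares)) ** 2
    denominator = sum(
        (l * ms) ** 2 / v for l, ms, v in zip(coeffs, mean_squares, dfs)
    )
    return numerator / denominator


def variance_ci(s2, df, chi2_lower, chi2_upper):
    """Approximate confidence limits for sigma^2, given the lower and upper
    chi-square percentiles (supplied by the caller from a table)."""
    return df * s2 / chi2_upper, df * s2 / chi2_lower


# Hypothetical mean squares 10 and 6 with 4 and 12 degrees of freedom,
# combined with coefficients 1/2 and 1/2.
coeffs, mean_squares, dfs = [0.5, 0.5], [10.0, 6.0], [4, 12]
s2 = sum(l * ms for l, ms in zip(coeffs, mean_squares))
nu = satterthwaite_df(coeffs, mean_squares, dfs)

# Placeholder percentiles: table values for 9 degrees of freedom at the
# 2.5% and 97.5% points, used because nu is fractional.
lower, upper = variance_ci(s2, nu, chi2_lower=2.700, chi2_upper=19.023)
print(nu, lower, upper)
```

In practice the percentiles would be interpolated or computed for the fractional degrees of freedom rather than rounded as here.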

Another application of the Satterthwaite procedure involves the construction
of a pseudo-F test when an exact F test cannot be found from the ratio of
two mean squares. In such cases, one can form linear combinations of mean
squares for the numerator, for the denominator, or for both the numerator and the
denominator such that their expected values are equal under the null hypothesis.
For example, let

\[
MS' = \ell_r S_r^2 + \cdots + \ell_s S_s^2
\]
and
\[
MS'' = \ell_u S_u^2 + \cdots + \ell_v S_v^2,
\]

where the mean squares are chosen such that $E(MS') = E(MS'')$ under the null
hypothesis that a particular variance component is zero. Now, an approximate
F test of the null hypothesis can be obtained by the statistic

\[
F' = \frac{MS'}{MS''},
\]

which has an approximate F distribution with $\nu'$ and $\nu''$ degrees of freedom
determined by

\[
\nu' = \frac{(\ell_r S_r^2 + \cdots + \ell_s S_s^2)^2}{\ell_r^2 S_r^4/\nu_r + \cdots + \ell_s^2 S_s^4/\nu_s}
\]
and
\[
\nu'' = \frac{(\ell_u S_u^2 + \cdots + \ell_v S_v^2)^2}{\ell_u^2 S_u^4/\nu_u + \cdots + \ell_v^2 S_v^4/\nu_v}.
\]
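The pseudo-F computation can be sketched in a few lines. The function names and all numerical values below are hypothetical: two mean squares are added in the numerator and two in the denominator, each with unit coefficients. The approximate degrees of freedom follow the expressions for $\nu'$ and $\nu''$ above.

```python
def combine(coeffs, mean_squares):
    """Linear combination of mean squares."""
    return sum(l * ms for l, ms in zip(coeffs, mean_squares))


def satterthwaite_df(coeffs, mean_squares, dfs):
    """Approximate degrees of freedom of a linear combination of mean squares."""
    numerator = combine(coeffs, mean_squares) ** 2
    denominator = sum(
        (l * ms) ** 2 / v for l, ms, v in zip(coeffs, mean_squares, dfs)
    )
    return numerator / denominator


def pseudo_f(num, den):
    """num and den are (coeffs, mean_squares, dfs) triples.
    Returns the statistic MS'/MS'' with its two approximate df."""
    f_stat = combine(num[0], num[1]) / combine(den[0], den[1])
    return f_stat, satterthwaite_df(*num), satterthwaite_df(*den)


# Hypothetical mean squares and degrees of freedom.
numerator = ([1, 1], [30.0, 4.0], [2, 12])
denominator = ([1, 1], [8.0, 6.0], [4, 6])
f_stat, df_num, df_den = pseudo_f(numerator, denominator)
print(f_stat, df_num, df_den)
```

The resulting degrees of freedom are generally fractional, so the reference percentile has to be interpolated from tables or computed numerically.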


In many situations, it may not be necessary to approximate both the numerator 
and the denominator mean squares for an approximate F' test. However, when 
both the numerator and the denominator mean squares are constructed, it is al- 
ways possible to find additive combinations of mean squares, and thereby avoid 
subtracting mean squares which may result in a poor approximation. For some 
further discussions of pseudo-F tests, see Anderson (1960) and Eisen (1966).

In many applications of the Satterthwaite procedure, some of the mean
squares may involve negative coefficients. Satterthwaite remarked that care
should be exercised in applying the approximation when some of the coeffi-
cients may be negative. When negative coefficients are involved, one can rewrite
the linear combination as $S^2 = S_A^2 - S_B^2$, where $S_A^2$ contains all the mean squares
with positive coefficients and $S_B^2$ those with negative coefficients. Now, the degrees
of freedom associated with the approximate chi-square distribution of $S^2$ are
determined by

\[
f = (S_A^2 - S_B^2)^2/(S_A^4/f_A + S_B^4/f_B),
\]

where $f_A$ and $f_B$ are the degrees of freedom associated with the approximate
chi-square distributions of $S_A^2$ and $S_B^2$, respectively. Gaylor and Hopper (1969)
showed that the Satterthwaite approximation for $S^2$ with $f$ degrees of freedom is
an adequate one when

\[
S_A^2/S_B^2 > F[f_B, f_A; 0.975] \times F[f_A, f_B; 0.5]
\]

if $f_A \le 100$ and $f_B \ge f_A/2$. The approximation is usually adequate for the dif-
ferences of mean squares when the mean squares being subtracted are relatively
small. Khuri (1995) gives a necessary and sufficient condition for the Satterth-
waite approximation to be exact in balanced mixed models.


L COMPONENTS OF VARIANCE 


In discussing Models II and III, we have introduced variances corresponding 
to the random effects terms in the analysis of variance model. These have been 
designated “components of variance” since they represent the parts of the total 
variation that can be ascribed to these sources. The variance components are 
associated with random effects and appear in both random and mixed models. 

Variance components were first employed by Fisher (1918) in connection 
with genetic research on Mendelian laws of inheritance. They have been widely 
used in evaluating the precision of instruments and, in general, are useful in 
determining the variables that contribute most to the variability of the process 
or the different sources contributing to the variation in an observation. This
permits corrective actions to be taken to reduce the effects of these variables.
Another use has been described by Cameron (1951), who used variance
components in evaluating the precision of estimating the clean content of wool. 
Kussmaul and Anderson (1967) describe the application of variance compo- 
nents for analyzing composite samples, which are obtained by pooling data 
from individual samples. 

There is a large body of literature related to variance components, which
covers results on hypothesis testing, point estimation, and confidence intervals,
and fairly complete bibliographies are provided by Sahai (1979), Sahai et al.
(1985), and Singhal et al. (1988). Additional works of interest include survey
papers by Crump (1951), Searle (1971a, 1995), Khuri and Sahai (1985), and
Burdick and Graybill (1988), as well as texts and monographs by Rao and Kleffe
(1988), Burdick and Graybill (1992), Searle et al. (1992), and Rao (1997).


M INTRACLASS CORRELATION


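In the random effects model

\[
y_{ij} = \mu + \alpha_i + e_{ij}, \quad i = 1, 2, \ldots, a; \; j = 1, 2, \ldots, n,
\]

$\mu$ is considered to be a fixed constant and the $\alpha_i$'s and the $e_{ij}$'s are independently
distributed random variables with mean zero and variances $\sigma_\alpha^2$ and $\sigma_e^2$, respectively.
Thus, as a part of the model,

\[
\begin{aligned}
E(y_{ij}) &= E(\mu) + E(\alpha_i) + E(e_{ij}) \\
          &= \mu + 0 + 0 \\
          &= \mu
\end{aligned}
\]

and

\[
\begin{aligned}
\mathrm{Var}(y_{ij}) &= \mathrm{Var}(\mu) + \mathrm{Var}(\alpha_i) + \mathrm{Var}(e_{ij}) \\
                     &= 0 + \sigma_\alpha^2 + \sigma_e^2 \\
                     &= \sigma_\alpha^2 + \sigma_e^2.
\end{aligned}
\]

The covariance structure of the model may be represented as follows:

\[
\mathrm{Cov}(y_{ij}, y_{i'j'}) =
\begin{cases}
0, & \text{if } i \ne i', \\
\sigma_e^2 + \sigma_\alpha^2, & \text{if } i = i', \; j = j', \\
\sigma_\alpha^2, & \text{if } i = i', \; j \ne j'.
\end{cases}
\]

The intraclass correlation is then defined by

\[
\rho = \frac{\mathrm{Cov}(y_{ij}, y_{ij'})}{\sqrt{\mathrm{Var}(y_{ij})}\,\sqrt{\mathrm{Var}(y_{ij'})}}
     = \frac{\sigma_\alpha^2}{\sigma_e^2 + \sigma_\alpha^2}.
\]

Thus, $\rho$ is the correlation between the pair of individuals belonging to the same
class and has the range of values from $-1/(n-1)$ to 1. The intraclass correlation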
was first introduced by Fisher (1918) as a measure of the correlation between 
the members of the same family, group, or class. It can be interpreted as the 
proportion of the total variability due to the differences in all possible treatment 
groups of this type. The intraclass correlation coefficient is a parameter that
has been studied classically in statistics (see, e.g., Kendall and Stuart (1961,
pp. 302-304)). It has found extensive applications in several different fields
of study including use as a measure of the degree of familial resemblance 
with respect to biological and environmental characteristics. It also plays an 
important role in reliability theory involving observations on a sample of various 
judges or raters, and in sensitivity analysis where it has been used to measure 
the efficacy of an experimental treatment. For a review of inference procedures 
for the intraclass correlation coefficient in the one-way random effects model, 
see Donner (1986). 
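
The definition can be tried out numerically. The sketch below first evaluates the intraclass correlation from known variance components, and then estimates it from balanced one-way data using the analysis of variance (method-of-moments) estimator. The estimator is a common choice assumed here for illustration and is not prescribed by the text; the function names and data are hypothetical.

```python
def intraclass_correlation(var_alpha, var_error):
    """rho = sigma_alpha^2 / (sigma_e^2 + sigma_alpha^2)."""
    return var_alpha / (var_error + var_alpha)


def anova_icc_estimate(groups):
    """Moment estimate of rho from a balanced one-way layout (equal group sizes)."""
    a = len(groups)
    n = len(groups[0])
    means = [sum(g) / n for g in groups]
    grand = sum(means) / a
    ms_between = n * sum((m - grand) ** 2 for m in means) / (a - 1)
    ms_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g) / (a * (n - 1))
    var_alpha_hat = (ms_between - ms_within) / n  # may be negative
    return var_alpha_hat / (var_alpha_hat + ms_within)


print(intraclass_correlation(3.0, 1.0))
print(anova_icc_estimate([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
# Identical group means give the lower bound -1/(n - 1).
print(anova_icc_estimate([[1, 2, 3], [1, 2, 3]]))
```

When the between-group mean square is zero, the estimate equals $-1/(n-1)$, which matches the lower end of the range quoted above.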


N ANALYSIS OF COVARIANCE 


The analysis of covariance is a combination of analysis of variance and
regression. In analysis of variance, all the factors being studied are treated
qualitatively and in analysis of regression all the factors are treated
quantitatively. In analysis of covariance, some factors are treated qualitatively
and some are treated quantitatively. The term independent variable often refers
to a factor treated quantitatively in analysis of covariance and regression. The
term covariate or concomitant variable is also used to denote an independent
variable in an analysis of covariance. The analysis of covariance involves
adjusting the observed value of the response or dependent variable for the linear
effect of the concomitant variable. If such an adjustment for the effects of the
concomitant variable is not made, the estimate of the error mean square would be
inflated, which would make the analysis of variance test less sensitive. The
adjustment or elimination of the linear effect of the concomitant variable
generally results in a smaller error mean square. The analysis of covariance uses
regression analysis techniques for elimination of the linear effect of the
concomitant variable. The technique was originally introduced by Fisher (1932),
and Cochran (1957) presented a detailed account of the subject. Analysis of
covariance techniques, however, are generally complicated and are often
considered to be one of the most misunderstood and misused statistical techniques
commonly employed by researchers. A readable account of the subject is given in
Snedecor and Cochran (1989, Chapter 18) and Winer et al. (1991, Chapter 10). For
a more mathematical treatment of the topic, see Scheffé (1959, Chapter 6). An
extended expository review of the analysis of covariance is contained in a set of
seven papers which appeared in a special issue of Biometrics (Vol. 13, No. 3,
1957). A subsequent issue of the same journal (Vol. 38, No. 3, 1982) includes
discussion of complex designs and nonlinear models. Further discussions on
analysis of covariance can be found in a series of papers appearing in a special
issue of Communications in Statistics: Part A, Theory and Methods (Vol. 8,
No. 8, 1979). For a book-length account of the subject, see Huitema (1980).


O EQUIVALENCE OF THE ANOVA F AND 
TWO-SAMPLE t TESTS 


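In this appendix, it is shown that in a one-way classification with two groups,
the analysis of variance $F$ test is equivalent to the two-sample $t$ test. Consider
a one-way classification with $a$ groups and let the $i$-th group contain $n_i$
observations with $N = \sum_{i=1}^{a} n_i$. Let $y_{ij}$ be the $j$-th observation from the
$i$-th group ($i = 1, 2, \ldots, a$; $j = 1, 2, \ldots, n_i$). The $F$ statistic for testing the
hypothesis of the equality of treatment means is defined by

\[
F = \frac{MS_B}{MS_W} = \frac{SS_B/df_B}{SS_W/df_W},
\]

where

\[
SS_B = \sum_{i=1}^{a} n_i(\bar{y}_{i.} - \bar{y}_{..})^2,
\]

\[
SS_W = \sum_{i=1}^{a}\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{i.})^2,
\]

\[
df_B = a - 1 \quad \text{and} \quad df_W = N - a.
\]

For the case of two groups (i.e., $a = 2$), it follows that

\[
SS_B = n_1(\bar{y}_{1.} - \bar{y}_{..})^2 + n_2(\bar{y}_{2.} - \bar{y}_{..})^2,
\]

\[
SS_W = (n_1 - 1)S_1^2 + (n_2 - 1)S_2^2,
\]

\[
df_B = 1, \quad \text{and} \quad df_W = n_1 + n_2 - 2,
\]

where

\[
S_1^2 = \frac{\sum_{j=1}^{n_1} (y_{1j} - \bar{y}_{1.})^2}{n_1 - 1}
\]

and

\[
S_2^2 = \frac{\sum_{j=1}^{n_2} (y_{2j} - \bar{y}_{2.})^2}{n_2 - 1}.
\]

Furthermore, noting that

\[
\bar{y}_{..} = \frac{n_1\bar{y}_{1.} + n_2\bar{y}_{2.}}{n_1 + n_2},
\]

we have

\[
(\bar{y}_{1.} - \bar{y}_{..})^2 = \frac{n_2^2(\bar{y}_{1.} - \bar{y}_{2.})^2}{(n_1 + n_2)^2}
\]

and

\[
(\bar{y}_{2.} - \bar{y}_{..})^2 = \frac{n_1^2(\bar{y}_{1.} - \bar{y}_{2.})^2}{(n_1 + n_2)^2}.
\]

Now, making the substitution, $SS_B$ can be written as

\[
SS_B = \frac{n_1 n_2^2(\bar{y}_{1.} - \bar{y}_{2.})^2 + n_1^2 n_2(\bar{y}_{1.} - \bar{y}_{2.})^2}{(n_1 + n_2)^2}
     = \frac{n_1 n_2(\bar{y}_{1.} - \bar{y}_{2.})^2}{n_1 + n_2}.
\]

Finally, the statistic $F$ can be written simply as

\[
F = \frac{(\bar{y}_{1.} - \bar{y}_{2.})^2 \Big/ \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}
         {\{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2\}/(n_1 + n_2 - 2)}
  = \frac{(\bar{y}_{1.} - \bar{y}_{2.})^2}{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)},
\]

where

\[
S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}.
\]

Since a two-sample $t$ statistic with $n_1 + n_2 - 2$ degrees of freedom is defined
by

\[
t = \frac{\bar{y}_{1.} - \bar{y}_{2.}}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},
\]

it follows that $F = t^2$.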
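
The identity can be checked numerically. The sketch below computes the one-way analysis of variance $F$ statistic from the general sums of squares and the pooled two-sample $t$ statistic from its own definition, and compares $F$ with $t^2$. The function names and the two small data sets are hypothetical.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def anova_f(groups):
    """One-way ANOVA F = (SS_B / df_B) / (SS_W / df_W)."""
    n_total = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def pooled_t(group1, group2):
    """Two-sample t statistic with pooled variance S_p^2."""
    n1, n2 = len(group1), len(group2)
    s1_sq = sum((y - mean(group1)) ** 2 for y in group1) / (n1 - 1)
    s2_sq = sum((y - mean(group2)) ** 2 for y in group2) / (n2 - 1)
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / sqrt(sp_sq * (1 / n1 + 1 / n2))


g1 = [5.1, 4.9, 6.2, 5.8]
g2 = [6.9, 7.4, 6.1, 7.8, 7.0]
print(anova_f([g1, g2]), pooled_t(g1, g2) ** 2)
```

The two printed values agree up to floating-point rounding, even though the groups have unequal sizes.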


P EQUIVALENCE OF THE ANOVA F AND PAIRED t TESTS


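In this appendix, it is shown that in a randomized block design with two
treatments, the analysis of variance $F$ test is equivalent to the paired $t$ test.
Consider a randomized block design with $n$ blocks and $t$ treatments. Let $y_{ij}$ be
the observation from the $i$-th treatment and the $j$-th block ($i = 1, 2, \ldots, t$;
$j = 1, 2, \ldots, n$). The $F$ statistic for testing the hypothesis of the equality of
treatment means is defined by

\[
F = \frac{MS_\tau}{MS_E} = \frac{SS_\tau/df_\tau}{SS_E/df_E},
\]

where

\[
SS_\tau = n\sum_{i=1}^{t} (\bar{y}_{i.} - \bar{y}_{..})^2,
\]

\[
SS_E = \sum_{i=1}^{t}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.} - \bar{y}_{.j} + \bar{y}_{..})^2,
\]

\[
df_\tau = t - 1, \quad \text{and} \quad df_E = (t - 1)(n - 1).
\]

For the case of two treatments (i.e., $t = 2$), it follows that

\[
SS_\tau = n[(\bar{y}_{1.} - \bar{y}_{..})^2 + (\bar{y}_{2.} - \bar{y}_{..})^2],
\]

\[
SS_E = \sum_{j=1}^{n} [(y_{1j} - \bar{y}_{1.} - \bar{y}_{.j} + \bar{y}_{..})^2 + (y_{2j} - \bar{y}_{2.} - \bar{y}_{.j} + \bar{y}_{..})^2],
\]

\[
df_\tau = 1, \quad \text{and} \quad df_E = n - 1.
\]

Furthermore, noting that

\[
\bar{y}_{.j} = \frac{y_{1j} + y_{2j}}{2}
\]

and

\[
\bar{y}_{..} = \frac{\bar{y}_{1.} + \bar{y}_{2.}}{2},
\]

we have

\[
(\bar{y}_{1.} - \bar{y}_{..})^2 = \frac{(\bar{y}_{1.} - \bar{y}_{2.})^2}{4},
\]

\[
(\bar{y}_{2.} - \bar{y}_{..})^2 = \frac{(\bar{y}_{1.} - \bar{y}_{2.})^2}{4},
\]

\[
(y_{1j} - \bar{y}_{1.} - \bar{y}_{.j} + \bar{y}_{..})^2 = \frac{[(y_{1j} - y_{2j}) - (\bar{y}_{1.} - \bar{y}_{2.})]^2}{4},
\]

and

\[
(y_{2j} - \bar{y}_{2.} - \bar{y}_{.j} + \bar{y}_{..})^2 = \frac{[(y_{1j} - y_{2j}) - (\bar{y}_{1.} - \bar{y}_{2.})]^2}{4}.
\]

Now, making the substitution, $SS_\tau$ and $SS_E$ can be written as

\[
SS_\tau = \frac{n}{2}(\bar{y}_{1.} - \bar{y}_{2.})^2
\]

and

\[
SS_E = \frac{1}{2}\sum_{j=1}^{n} [(y_{1j} - y_{2j}) - (\bar{y}_{1.} - \bar{y}_{2.})]^2.
\]

Again, letting $d_j = y_{1j} - y_{2j}$ and $\bar{d} = \sum_{j=1}^{n} d_j/n = \bar{y}_{1.} - \bar{y}_{2.}$, we obtain

\[
SS_\tau = \frac{n}{2}(\bar{d})^2
\]

and

\[
SS_E = \frac{1}{2}\sum_{j=1}^{n} (d_j - \bar{d})^2.
\]

Finally, the statistic $F$ can be written simply as

\[
F = \frac{\dfrac{n}{2}(\bar{d})^2 \Big/ 1}{\dfrac{1}{2}\sum_{j=1}^{n} (d_j - \bar{d})^2 \Big/ (n - 1)}
  = \frac{(\bar{d})^2}{S_d^2/n},
\]

where

\[
S_d^2 = \frac{\sum_{j=1}^{n} (d_j - \bar{d})^2}{n - 1}.
\]

Since a paired $t$ statistic with $n - 1$ degrees of freedom is defined by

\[
t = \frac{\bar{d}}{S_d/\sqrt{n}},
\]

it follows that $F = t^2$.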
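
A small numerical check of this result is sketched below. It evaluates the randomized block $F$ statistic from the general sums of squares and the paired $t$ statistic from the differences $d_j$, and compares $F$ with $t^2$. The function names and the data for the five blocks are hypothetical.

```python
from math import sqrt


def randomized_block_f(y):
    """F for treatments in a randomized block design; y[i][j] = treatment i, block j."""
    t = len(y)
    n = len(y[0])
    treatment_means = [sum(row) / n for row in y]
    block_means = [sum(y[i][j] for i in range(t)) / t for j in range(n)]
    grand = sum(treatment_means) / t
    ss_treatment = n * sum((m - grand) ** 2 for m in treatment_means)
    ss_error = sum(
        (y[i][j] - treatment_means[i] - block_means[j] + grand) ** 2
        for i in range(t)
        for j in range(n)
    )
    return (ss_treatment / (t - 1)) / (ss_error / ((t - 1) * (n - 1)))


def paired_t(y1, y2):
    """Paired t statistic based on the differences d_j = y1j - y2j."""
    d = [u - v for u, v in zip(y1, y2)]
    n = len(d)
    d_bar = sum(d) / n
    sd_sq = sum((dj - d_bar) ** 2 for dj in d) / (n - 1)
    return d_bar / sqrt(sd_sq / n)


y1 = [12.1, 11.4, 13.0, 12.6, 11.9]
y2 = [11.2, 11.0, 12.1, 12.8, 11.1]
print(randomized_block_f([y1, y2]), paired_t(y1, y2) ** 2)
```

The two printed values agree up to floating-point rounding.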


Q EXPECTED VALUE AND VARIANCE


If $X$ is a discrete random variable with probability function $p(x)$, the expected
value of $X$, denoted by $E(X)$, is defined as

\[
E(X) = \sum_{i=1}^{\infty} x_i\, p(x_i),
\]

provided that $\sum_{i=1}^{\infty} |x_i|\, p(x_i) < \infty$. If the series diverges, the expected value
is undefined. If $X$ is a continuous random variable with probability density
function $f(x)$, then

\[
E(X) = \int_{-\infty}^{\infty} x f(x)\, dx,
\]

provided that $\int_{-\infty}^{\infty} |x| f(x)\, dx < \infty$. If the integral diverges, the expected value
is undefined. $E(X)$ is also referred to as the mathematical expectation or the
mean value of $X$ and is often denoted by $\mu$.

The expected value of a random variable is its average value and can be
considered as the center of the distribution. For any constants $a, b_1, b_2, \ldots, b_k$,
and the random variables $X_1, X_2, \ldots, X_k$, the following properties hold:

\[
E(a) = a,
\]

\[
E(b_i X_i) = b_i E(X_i),
\]

and

\[
E\left(a + \sum_{i=1}^{k} b_i X_i\right) = a + \sum_{i=1}^{k} b_i E(X_i).
\]

If $X$ is a random variable with expected value $E(X)$, the variance of $X$,
denoted by $\mathrm{Var}(X)$, is defined as

\[
\mathrm{Var}(X) = E[X - E(X)]^2,
\]

provided the expectation exists. The square root of the variance is known as
the standard deviation. The variance is often denoted by $\sigma^2$ and the standard
deviation by $\sigma$.

The variance of a random variable is the average or expected value of squared
deviations from the mean and measures the variation around the mean value.
For any constants $a$ and $b$, and the random variable $X$, the following properties
hold:

\[
\mathrm{Var}(a) = 0,
\]

\[
\mathrm{Var}(bX) = b^2\, \mathrm{Var}(X),
\]

and

\[
\mathrm{Var}(a + bX) = b^2\, \mathrm{Var}(X).
\]
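
These definitions and properties can be verified on a small discrete distribution. The sketch below uses exact rational arithmetic and a fair six-sided die as the distribution of $X$; the die, the constants $a = 2$ and $b = 3$, and the function names are illustrative assumptions.

```python
from fractions import Fraction


def expectation(pmf, g=lambda x: x):
    """E[g(X)] for a discrete distribution given as {value: probability}."""
    return sum(g(x) * p for x, p in pmf.items())


def variance(pmf):
    """Var(X) = E[X - E(X)]^2."""
    mu = expectation(pmf)
    return expectation(pmf, lambda x: (x - mu) ** 2)


def linear_transform(pmf, a, b):
    """Distribution of a + b*X (b nonzero, so distinct values stay distinct)."""
    return {a + b * x: p for x, p in pmf.items()}


die = {x: Fraction(1, 6) for x in range(1, 7)}
shifted = linear_transform(die, a=2, b=3)

print(expectation(die), variance(die))
print(expectation(shifted), variance(shifted))
```

The second line of output illustrates $E(a + bX) = a + bE(X)$ and $\mathrm{Var}(a + bX) = b^2\,\mathrm{Var}(X)$: the constant $a$ shifts the mean but leaves the variance unchanged.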


R COVARIANCE AND CORRELATION 


If $X$ and $Y$ are jointly distributed random variables with expected values $E(X)$
and $E(Y)$, the covariance of $X$ and $Y$, denoted by $\mathrm{Cov}(X, Y)$, is defined as

\[
\mathrm{Cov}(X, Y) = E[(X - E(X))(Y - E(Y))].
\]

The covariance is the average value of the products of the deviations of the
values of $X$ from its mean and the deviations of the values of $Y$ from its mean.
It can be readily shown that

\[
\mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y).
\]

The variance of a random variable is a measure of its variability and the
covariance of two random variables can be considered as a measure of their
joint variability or the degree of association. For any constants $a$, $b$, $c$, $d$, and
the random variables $X$, $Y$, $U$, $V$, the following properties hold:

\[
\mathrm{Cov}(a, X) = 0,
\]

\[
\mathrm{Cov}(aX, bY) = ab\, \mathrm{Cov}(X, Y),
\]

and

\[
\mathrm{Cov}(aX + bY, cU + dV) = ac\, \mathrm{Cov}(X, U) + ad\, \mathrm{Cov}(X, V)
  + bc\, \mathrm{Cov}(Y, U) + bd\, \mathrm{Cov}(Y, V).
\]

In general, for any constants $a_i$'s, $b_j$'s, and the random variables $X_i$'s and $Y_j$'s
($i = 1, 2, \ldots, k$; $j = 1, 2, \ldots, \ell$), the following relationships hold:

\[
\mathrm{Cov}\left(\sum_{i=1}^{k} a_i X_i, \sum_{j=1}^{\ell} b_j Y_j\right)
  = \sum_{i=1}^{k}\sum_{j=1}^{\ell} a_i b_j\, \mathrm{Cov}(X_i, Y_j),
\]

and

\[
\mathrm{Var}\left(\sum_{i=1}^{k} a_i X_i\right)
  = \sum_{i=1}^{k} a_i^2\, \mathrm{Var}(X_i) + 2 \mathop{\sum_{i=1}^{k}\sum_{i'=1}^{k}}_{i < i'} a_i a_{i'}\, \mathrm{Cov}(X_i, X_{i'}).
\]

If $X$ and $Y$ are jointly distributed random variables, the correlation of $X$ and
$Y$, denoted by $\rho$, is defined as

\[
\rho = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}}.
\]

The correlation can be considered as the standardized covariance. Correlation
equals covariance if both variables measure the standardized scores with unit
variances. It can be shown that $-1 \le \rho \le 1$.
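
The covariance identities can be checked on a small joint distribution. The sketch below uses exact rational arithmetic on a hypothetical joint probability function for two binary variables, and verifies the variance of the linear combination $2X + 3Y$ against a direct calculation. The joint probabilities, the constants, and the function names are illustrative assumptions.

```python
from fractions import Fraction
from math import sqrt

# Hypothetical joint probability function p(x, y).
JOINT = {
    (0, 0): Fraction(3, 10),
    (0, 1): Fraction(1, 5),
    (1, 0): Fraction(1, 10),
    (1, 1): Fraction(2, 5),
}


def expect(g):
    """E[g(X, Y)] under the joint distribution."""
    return sum(g(x, y) * p for (x, y), p in JOINT.items())


def cov(g, h):
    """Cov(g, h) = E(gh) - E(g)E(h)."""
    return expect(lambda x, y: g(x, y) * h(x, y)) - expect(g) * expect(h)


def first(x, y):
    return x


def second(x, y):
    return y


var_x = cov(first, first)
var_y = cov(second, second)
cov_xy = cov(first, second)
rho = float(cov_xy) / sqrt(float(var_x * var_y))

# Var(2X + 3Y) computed directly and via the expansion formula.
a, b = 2, 3
direct = cov(lambda x, y: a * x + b * y, lambda x, y: a * x + b * y)
expanded = a ** 2 * var_x + b ** 2 * var_y + 2 * a * b * cov_xy
print(cov_xy, rho, direct, expanded)
```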


S RULES FOR DETERMINING THE ANALYSIS 
OF VARIANCE MODEL 


In this appendix, we outline rules for determining the analysis of variance 
model in a balanced experimental layout. The rules are applicable to crossed 




classifications containing an equal number of observations for each combination 
of factor levels. They are also applicable to completely nested classifications 
as well as crossed-nested classifications containing an equal number of levels 
for each nested factor. We illustrate the rules with a three-factor crossed-nested 
classification where factors A and C are crossed and factor B is nested within 
factor A and crossed with factor C. We assume that the factor A has a levels, 
the factor B has b levels, the factor C has c levels, and there are n replications. 


Rule 1. Each model contains a general constant or overall mean to be denoted
by $\mu$.


Rule 2. Each model contains a main effect for each factor which is denoted
by the corresponding Greek letter with a suffix indicating the level of the
factor. If a factor is nested within another factor, the nesting is indicated using
the parenthesis notation for its suffix. For the example being considered, the
main effects for factors A, B, and C are: $\alpha_i$, $\beta_{j(i)}$, and $\gamma_k$, $i = 1, \ldots, a$;
$j = 1, \ldots, b$; $k = 1, \ldots, c$.


Rule 3. Each model contains interaction terms corresponding to all crossed
factors. There are no interaction terms for those factors containing both a nested
factor and the factor within which it is nested. For the example being considered,
there are A × C and B × C interactions. However, there are no A × B
and A × B × C interactions since factor B is nested within factor A. The
interaction terms in the model are denoted by the combination of the Greek
letters enclosed within a pair of parentheses followed by subscripts indicating
the levels of the factors being crossed. For the example being considered, the
model terms for A × C and B × C interactions are: $(\alpha\gamma)_{ik}$ and $(\beta\gamma)_{jk}$,
$i = 1, \ldots, a$; $j = 1, \ldots, b$; $k = 1, \ldots, c$.


Rule 4. Interactions between a nested factor and another factor with which
the nested factor is crossed are always themselves nested. In the example being
considered, factor B is nested within factor A and is crossed with factor C;
thus, the B × C interaction is considered as nested within factor A. If an
interaction term is nested within another, the nesting is indicated by the parenthesis
notation for its suffix. For the example being considered, the fact that $(\beta\gamma)_{jk}$ is
nested within the levels of factor A is indicated by the parenthesis notation as
$(\beta\gamma)_{jk(i)}$, $i = 1, \ldots, a$; $j = 1, \ldots, b$; $k = 1, \ldots, c$.


Rule 5. The final term in the model is the error term which is considered nested
within all the factors. For the example being considered, the model term for the
error is denoted by $e_{\ell(ijk)}$, $i = 1, \ldots, a$; $j = 1, \ldots, b$; $k = 1, \ldots, c$;
$\ell = 1, \ldots, n$.


Rule 6. The final model is written as an algebraic equation between the
response variable denoted usually by the Roman letter $x$ or $y$ (with a suffix
comprising all the subscripts) in the left-hand side and the sum of all the model
terms appearing in the right-hand side. For the example being considered, the
model is:

\[
y_{ijk\ell} = \mu + \alpha_i + \beta_{j(i)} + \gamma_k + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk(i)} + e_{\ell(ijk)}
\quad
\begin{cases}
i = 1, \ldots, a \\
j = 1, \ldots, b \\
k = 1, \ldots, c \\
\ell = 1, \ldots, n.
\end{cases}
\]


T RULES FOR CALCULATING SUMS OF SQUARES 
AND DEGREES OF FREEDOM 


In this appendix, we outline rules for calculating sums of squares and degrees 
of freedom in an analysis of variance model. The rules are applicable to crossed 
classifications containing an equal number of observations for each combination 
of factor levels. They are also applicable to completely nested classifications as 
well as crossed-nested classifications containing an equal number of levels for 
each nested factor. We illustrate the rules with the three-factor crossed-nested 
classification considered in Appendix S. 


Rule 1. Write the model equation following the rules outlined in Appendix S.
For the example being considered, the model equation is

\[
y_{ijk\ell} = \mu + \alpha_i + \beta_{j(i)} + \gamma_k + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk(i)} + e_{\ell(ijk)}
\quad
\begin{cases}
i = 1, \ldots, a \\
j = 1, \ldots, b \\
k = 1, \ldots, c \\
\ell = 1, \ldots, n.
\end{cases}
\]


Rule 2. For each model term (except the general constant) write a symbolic
product consisting of the subscripts of the term, using the subscript alone if it
is in parentheses and subscript minus 1 if it is not in parentheses. Expand the
symbolic product algebraically. For example, the symbolic product for $\alpha_i$ is
$i - 1$, for $(\beta\gamma)_{jk(i)}$ it is $i(j - 1)(k - 1) = ijk - ij - ik + i$, and so on.


Rule 3. The typical expression to be squared and summed for obtaining the
sum of squares associated with a model term consists of algebraic means
indexed by subscripts of the symbolic product determined by Rule 2 and dots for
the subscripts missing in the symbolic product. The number 1 is replaced by
the suffix containing all the dots and designates the grand mean. In the example
being considered, the symbolic product for $\alpha_i$ is $i - 1$ and the algebraic
expression to be squared and summed is $\bar{y}_{i...} - \bar{y}_{....}$. The symbolic product for
$(\beta\gamma)_{jk(i)}$ is $i(j - 1)(k - 1) = ijk - ij - ik + i$, and the algebraic expression
to be squared and summed is $\bar{y}_{ijk.} - \bar{y}_{ij..} - \bar{y}_{i.k.} + \bar{y}_{i...}$, and so on.


Rule 4. The sum of squares for a model term is obtained by squaring the
algebraic expression formed by Rule 3, summing it over the subscripts in the
model term, and then multiplying it by the product of the number of levels
corresponding to the subscripts not appearing in the suffix of the model term.
For the example being considered, the sum of squares for $\alpha_i$ is obtained by
squaring $(\bar{y}_{i...} - \bar{y}_{....})$, summing over $i$, and then multiplying it by $bcn$; that is,
$bcn \sum_{i=1}^{a} (\bar{y}_{i...} - \bar{y}_{....})^2$. Similarly, the sum of squares for $(\beta\gamma)_{jk(i)}$ is obtained
by squaring $(\bar{y}_{ijk.} - \bar{y}_{ij..} - \bar{y}_{i.k.} + \bar{y}_{i...})$, summing over $(i, j, k)$, and then
multiplying it by $n$; that is,
$n \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c} (\bar{y}_{ijk.} - \bar{y}_{ij..} - \bar{y}_{i.k.} + \bar{y}_{i...})^2$, and so on.


Rule 5. The sum of squares for the general constant is obtained by squaring
the grand mean $(\bar{y}_{....})$ and then multiplying it by the total number of observations
$(abcn)$, that is, $abcn(\bar{y}_{....})^2$. This sum of squares is usually not included in the
analysis of variance table.


Rule 6. The total sum of squares is obtained by squaring the deviations of the
observations from the grand mean, and then summing it over all the subscripts,
that is, $\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\sum_{\ell=1}^{n} (y_{ijk\ell} - \bar{y}_{....})^2$.


Rule 7. The degrees of freedom corresponding to a sum of squares are
calculated by replacing the subscripts in the symbolic product formed by Rule 2
by the number of levels for that subscript. For the example being considered,
the degrees of freedom corresponding to the sum of squares for $\alpha_i$ are obtained
by replacing $i$ by $a$ in the symbolic product $i - 1$; that is, $a - 1$. Similarly, for
$(\beta\gamma)_{jk(i)}$ the symbolic product is $i(j - 1)(k - 1)$ and the corresponding degrees
of freedom are $a(b - 1)(c - 1)$.


Rule 8. The number of degrees of freedom for the general constant is one, 
and the total number of degrees of freedom is defined as one less than the total 
number of observations. 
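
The degrees-of-freedom rules lend themselves to a short computation. The sketch below encodes each model term of the example by its subscripts outside and inside parentheses, forms the symbolic product of Rule 2, and substitutes the numbers of levels as in Rule 7. The data structure, the function name, and the particular numbers of levels ($a = 3$, $b = 4$, $c = 2$, $n = 5$) are assumptions made for illustration.

```python
from math import prod

# Hypothetical numbers of levels for subscripts i, j, k, l (l stands for the
# replication subscript written as a script letter in the text).
LEVELS = {"i": 3, "j": 4, "k": 2, "l": 5}

# Each term: (subscripts not in parentheses, subscripts in parentheses).
TERMS = {
    "A": (("i",), ()),
    "B(A)": (("j",), ("i",)),
    "C": (("k",), ()),
    "AC": (("i", "k"), ()),
    "BC(A)": (("j", "k"), ("i",)),
    "Error": (("l",), ("i", "j", "k")),
}


def degrees_of_freedom(free, nested, levels):
    """Symbolic product: (levels - 1) for free subscripts, levels for nested ones."""
    return prod(levels[s] - 1 for s in free) * prod(levels[s] for s in nested)


df = {name: degrees_of_freedom(free, nested, LEVELS) for name, (free, nested) in TERMS.items()}
total_df = prod(LEVELS.values()) - 1  # one less than the number of observations
print(df, total_df)
```

The degrees of freedom of the model terms add up to the total degrees of freedom given by Rule 8, which is a convenient check on the bookkeeping.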


U RULES FOR FINDING EXPECTED MEAN SQUARES 


Determination of expected mean squares in an analysis of variance model is 
essential in order that appropriate mean squares may be used to construct an 
F statistic for a particular hypothesis of interest. They are also important for 
finding estimators of the variance components. Although they are not difficult to 
obtain, it is evident from our previous treatment that the derivation of expected 
mean squares for various models can be tedious, involving an inordinate amount 
of time and effort. In this appendix, we outline rules for finding expected mean 
squares in an analysis of variance model. The rules are applicable to crossed
classifications containing an equal number of observations for each combination 
of factor levels. They are also applicable to completely nested classifications 




as well as crossed-nested classifications containing an equal number of levels 
for each nested factor. We illustrate the rules with a three-factor crossed-nested 
classification considered earlier in Appendices S and T. It should be noted 
here that in the determination of rules for the analysis of variance model and 
the calculation of sums of squares and degrees of freedom, it does not matter 
whether factor effects are fixed or random. However, this is not so in finding 
expected mean squares and we now assume that factors A and C are fixed 
whereas factor B is random. 


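Rule 1. Write the mathematical model following the rules given in Appendix
S, including the assumptions for fixed and random effects. For the example
being considered, the mathematical model is

\[
y_{ijk\ell} = \mu + \alpha_i + \beta_{j(i)} + \gamma_k + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk(i)} + e_{\ell(ijk)}
\quad
\begin{cases}
i = 1, \ldots, a \\
j = 1, \ldots, b \\
k = 1, \ldots, c \\
\ell = 1, \ldots, n,
\end{cases}
\]

where the $\alpha_i$'s, $\gamma_k$'s, and $(\alpha\gamma)_{ik}$'s are assumed to be constants subject to the
restrictions:

\[
\sum_{i=1}^{a} \alpha_i = \sum_{k=1}^{c} \gamma_k = 0,
\]

\[
\sum_{i=1}^{a} (\alpha\gamma)_{ik} = \sum_{k=1}^{c} (\alpha\gamma)_{ik} = 0.
\]

We further assume that the $\beta_{j(i)}$'s, $(\beta\gamma)_{jk(i)}$'s, and $e_{\ell(ijk)}$'s are normally
distributed with zero means and variances $\sigma^2_{\beta(\alpha)}$, $\sigma^2_{\beta\gamma(\alpha)}$, and $\sigma_e^2$, respectively; and the
three groups of random variables are pairwise independent. The random effects
$(\beta\gamma)_{jk(i)}$'s, however, are correlated due to the following restrictions:

\[
\sum_{k=1}^{c} (\beta\gamma)_{jk(i)} = 0 \quad \text{for all } j(i).
\]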


Rule 2. Construct a two-way row × column table where there is a row for
each component term in the model including the error term (except the general
constant) and there is a column for each of the subscripts that appear in the
model. The particular order of rows and columns is immaterial, but it helps
to maintain some systematic scheme in order to avoid any mistakes. For the
example being considered, the two-way table is constructed as follows:


                              $i$     $j$     $k$     $\ell$
    $\alpha_i$
    $\beta_{j(i)}$
    $\gamma_k$
    $(\alpha\gamma)_{ik}$
    $(\beta\gamma)_{jk(i)}$
    $e_{\ell(ijk)}$


Rule 3. In each row where one or more subscripts are in parentheses, write 1
in the columns corresponding to the subscripts in parentheses. For the example
being considered, the two-way table now appears as follows:


                              $i$     $j$     $k$     $\ell$
    $\alpha_i$
    $\beta_{j(i)}$             1
    $\gamma_k$
    $(\alpha\gamma)_{ik}$
    $(\beta\gamma)_{jk(i)}$    1
    $e_{\ell(ijk)}$            1       1       1


Rule 4. In each row where one or more subscripts are not in parentheses: 


(i) write 1 in the columns corresponding to subscripts not in parentheses if 
the subscript represents a random factor; 

(ii) write 0 in the columns corresponding to subscripts not in parentheses if 
the subscript represents a fixed factor. 


For the example being considered, the two-way table now appears as follows:


                              $i$     $j$     $k$     $\ell$
    $\alpha_i$                 0
    $\beta_{j(i)}$             1       1
    $\gamma_k$                                 0
    $(\alpha\gamma)_{ik}$      0               0
    $(\beta\gamma)_{jk(i)}$    1       1       0
    $e_{\ell(ijk)}$            1       1       1       1


Rule 5. Write the number of levels corresponding to the column subscripts in
the remaining cells that are still vacant. For the example being considered, the
two-way table now appears as follows:


                              $i$     $j$     $k$     $\ell$
    $\alpha_i$                 0       $b$     $c$     $n$
    $\beta_{j(i)}$             1       1       $c$     $n$
    $\gamma_k$                 $a$     $b$     0       $n$
    $(\alpha\gamma)_{ik}$      0       $b$     0       $n$
    $(\beta\gamma)_{jk(i)}$    1       1       0       $n$
    $e_{\ell(ijk)}$            1       1       1       1


Rule 6. Each fixed effect has the effects parameter defined by the sum of
squared effects divided by its degrees of freedom. Each random effect has
the effects parameter defined by the corresponding variance component. For
every model term representing a fixed effect, let $\lambda = \Phi$ designate the effects
parameter. For every model term representing a random effect, let $\lambda = \sigma^2$ be
the variance component for the random effect. Write all the $\lambda$ parameters in the
last column to the right of the two-way table where each $\lambda$ parameter appears on
the same line as its corresponding model term. The two-way table now appears
as follows:


                              $i$     $j$     $k$     $\ell$    $\lambda$
    $\alpha_i$                 0       $b$     $c$     $n$       $\Phi(\alpha)$
    $\beta_{j(i)}$             1       1       $c$     $n$       $\sigma^2_{\beta(\alpha)}$
    $\gamma_k$                 $a$     $b$     0       $n$       $\Phi(\gamma)$
    $(\alpha\gamma)_{ik}$      0       $b$     0       $n$       $\Phi(\alpha\gamma)$
    $(\beta\gamma)_{jk(i)}$    1       1       0       $n$       $\sigma^2_{\beta\gamma(\alpha)}$
    $e_{\ell(ijk)}$            1       1       1       1         $\sigma_e^2$


Rule 7. The expected mean square corresponding to any model term is
obtained as a linear combination of $\lambda$ parameters as determined by Rule 6 with
the coefficients determined as follows:


(i) The coefficient of the $\lambda$ parameter is zero if the subscript(s) of the model
term in that row (whether in parentheses or not) do not include all of
the subscripts (including those in parentheses) in the suffix of the model
term whose expected mean square is being evaluated.

(ii) The coefficients of the $\lambda$ parameters that are not defined as zero by
Rule 7(i) are determined by first deleting the columns corresponding to
the subscript(s) not in parentheses of the model term whose expected
mean square is being evaluated and then multiplying the entries of the
remaining columns.


For the example being considered, the coefficients of the $\lambda$ parameters for
different model terms are given as follows:


                                            Expected Mean Square of

                                       A      B(A)     C       AC     BC(A)   Error
    $\lambda$                          [$\alpha_i$] [$\beta_{j(i)}$] [$\gamma_k$] [$(\alpha\gamma)_{ik}$] [$(\beta\gamma)_{jk(i)}$] [$e_{\ell(ijk)}$]
    $\Phi(\alpha)$                     $bcn$   0       0       0       0       0
    $\sigma^2_{\beta(\alpha)}$         $cn$    $cn$    0       0       0       0
    $\Phi(\gamma)$                     0       0       $abn$   0       0       0
    $\Phi(\alpha\gamma)$               0       0       0       $bn$    0       0
    $\sigma^2_{\beta\gamma(\alpha)}$   0       0       $n$     $n$     $n$     0
    $\sigma_e^2$                       1       1       1       1       1       1


Finally, from the preceding table, the expected mean squares are given by

\[
E(MS_A) = \sigma_e^2 + cn\,\sigma^2_{\beta(\alpha)} + bcn\,\Phi(\alpha),
\]

\[
E(MS_{B(A)}) = \sigma_e^2 + cn\,\sigma^2_{\beta(\alpha)},
\]

\[
E(MS_C) = \sigma_e^2 + n\,\sigma^2_{\beta\gamma(\alpha)} + abn\,\Phi(\gamma),
\]

\[
E(MS_{AC}) = \sigma_e^2 + n\,\sigma^2_{\beta\gamma(\alpha)} + bn\,\Phi(\alpha\gamma),
\]

\[
E(MS_{BC(A)}) = \sigma_e^2 + n\,\sigma^2_{\beta\gamma(\alpha)},
\]

and

\[
E(MS_E) = \sigma_e^2.
\]
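
Rules 2 through 7 are mechanical enough to be sketched in a few lines of code. The following is an illustrative sketch, not a general-purpose implementation. It encodes the example model with factors A and C fixed and factor B random, fills in the two-way table entries by Rules 3 to 5, and derives the coefficients by Rule 7. The data layout, the function names, and the plain-text labels for the $\lambda$ parameters are all assumptions of this example.

```python
# Column subscripts with their level symbols (l stands for the replication subscript).
LEVELS = {"i": "a", "j": "b", "k": "c", "l": "n"}
# Whether each subscript belongs to a random factor (A and C fixed, B and error random).
RANDOM = {"i": False, "j": True, "k": False, "l": True}

# (name, subscripts not in parentheses, subscripts in parentheses, lambda label)
TERMS = [
    ("A", ("i",), (), "Phi(alpha)"),
    ("B(A)", ("j",), ("i",), "sigma2_beta(alpha)"),
    ("C", ("k",), (), "Phi(gamma)"),
    ("AC", ("i", "k"), (), "Phi(alpha gamma)"),
    ("BC(A)", ("j", "k"), ("i",), "sigma2_beta gamma(alpha)"),
    ("Error", ("l",), ("i", "j", "k"), "sigma2_e"),
]


def table_entry(free, nested, col):
    """Entry of the two-way table for one row and one column (Rules 3 to 5)."""
    if col in nested:
        return 1
    if col in free:
        return 1 if RANDOM[col] else 0
    return LEVELS[col]


def coefficient(row, target):
    """Coefficient of the row's lambda in the expected mean square of target (Rule 7)."""
    _, row_free, row_nested, _ = row
    _, tgt_free, tgt_nested, _ = target
    if not set(tgt_free + tgt_nested) <= set(row_free + row_nested):
        return 0  # Rule 7(i)
    symbols = []
    for col in LEVELS:  # Rule 7(ii): skip the deleted columns, multiply the rest
        if col in tgt_free:
            continue
        entry = table_entry(row_free, row_nested, col)
        if entry == 0:
            return 0
        if entry != 1:
            symbols.append(entry)
    return "".join(symbols) or "1"


def expected_mean_square(target):
    """Map each lambda label to its nonzero coefficient for the target term."""
    return {row[3]: c for row in TERMS if (c := coefficient(row, target)) != 0}


for term in TERMS:
    print(term[0], expected_mean_square(term))
```

For the term A, for example, the sketch reports the coefficients `bcn`, `cn`, and `1` for $\Phi(\alpha)$, $\sigma^2_{\beta(\alpha)}$, and $\sigma_e^2$, respectively.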


V SAMPLES AND SAMPLING DISTRIBUTION 


The major objective of any statistical analysis is to make inferences about the 
parameters of the population(s) under study. If the population is finite and con- 
tains only a small number of items or individuals, then it would be ideal to 
include every member of the population to record or examine the characteris- 
tic(s) of interest. However, most populations of interest are either infinite or 
too large so that it is not feasible in terms of time and money to include every 
member of the population in the study. Hence, in order to study such a popu- 
lation, the investigator carefully draws a sample, which is much smaller than 
the population, to examine its properties and then generalizes the results of the 
sample to the population of interest. The process of generalizing the results of 
a sample to the population is called statistical inference. 

The basic requirement of a sample is that it should be representative of
the population under study. However, in general, it is difficult to obtain a
representative sample. The usual procedure is to select a sample that is random.




The concept of randomness is intended to ensure that individual biases do not 
influence the selection of sample values. In addition, the randomness makes 
it possible to apply the laws of probability in drawing statistical inferences. A 
random sample is usually drawn with the help of a mechanical process such as 
throwing a coin or spinning a roulette wheel. The mechanical process generally 
used to obtain a random sample involves the use of a table of random numbers 
(see Statistical Tables XXIV). In addition, a great variety of computer programs 
exists for obtaining a random sample. The standard procedures for obtaining a 
random sample from a finite population using random numbers are discussed in 
most introductory statistics textbooks (see, e.g., Snedecor and Cochran (1989, 
Section 1.9)) and are not described here. 

For the purpose of statistical inference discussed in Appendix W, we assume
that we have a random sample $(x_1, x_2, \ldots, x_n)$ of a given size $n$ where $x_i$ is the
observed value of a certain characteristic $X$ on the $i$-th member of the sample.
We then calculate some function $T(x_1, x_2, \ldots, x_n)$ of the random sample, called
a statistic. We repeat the procedure for every possible sample of size $n$ that can
be drawn from the population. Now, the successive samples will differ from
one another and will lead to different values of the statistic $T$. Using a random
mechanism, we can draw repeated samples, calculate the value of $T$ for each
sample, and derive a frequency distribution of the statistic $T$. Such a distribution
of $T$ is known as the sampling distribution of $T$. The following figure shows a
schematic representation of the sampling distribution.


[Figure: successive samples (1st sample, 2nd sample, etc.) drawn from the parent
population (distribution of X), leading to the sampling distribution of T.]
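
The idea of a sampling distribution can be made concrete with a very small finite population. The sketch below enumerates every possible sample of size $n = 2$, taken without replacement from a hypothetical population of five values, and tabulates the frequency distribution of the sample mean as the statistic $T$. The population, the sample size, the choice of the mean as $T$, and the function name are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def sampling_distribution(population, n, statistic):
    """Frequency distribution of a statistic over all samples of size n
    drawn without replacement from a finite population."""
    return Counter(statistic(sample) for sample in combinations(population, n))


def sample_mean(sample):
    return sum(sample) / len(sample)


population = [2, 4, 6, 8, 10]
distribution = sampling_distribution(population, 2, sample_mean)

for value in sorted(distribution):
    print(value, distribution[value])

# Mean of the sampling distribution of the sample mean.
count = sum(distribution.values())
mean_of_means = sum(value * freq for value, freq in distribution.items()) / count
print(mean_of_means)
```

In this example the mean of the sampling distribution equals the population mean, which illustrates why the sample mean is a natural statistic for inference about the population mean. For large populations, exhaustive enumeration is replaced by repeated random sampling.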


W METHODS OF STATISTICAL INFERENCE 


The objective in an analysis of variance procedure is to make statistical infer- 
ences about the unknown parameters of the linear model. The main procedures 
for making inferences are hypothesis testing and point and interval estimation. 
In this appendix, we briefly summarize basic concepts of each procedure. 




HYPOTHESIS TESTING 


In hypothesis testing the investigator is interested in a particular value of an
unknown parameter and wants to employ a statistical test to determine whether
the data are consistent with the hypothesized value. The particular hypothesis to
be tested is referred to as the null hypothesis and is denoted by $H_0$. In addition,
there is another hypothesis which is a complement of the null hypothesis to be
concluded if the null hypothesis is found to be false. The complement of the
null hypothesis is referred to as the alternative hypothesis and is denoted by
$H_1$.

Working provisionally on the assumption that $H_0$ is true, a test statistic is
calculated from the data, as an index or measure, which is sensitive to
departures from $H_0$. Extreme values of the test statistic are unlikely to occur if $H_0$
is true and consequently lead to its rejection as a statement of the true value of
the parameter.

A statistical test cannot prove that a hypothesis is true or false. Even when $H_0$
is true, sampling variation can produce a very large or very small value of the test
statistic and the investigator may be tricked into rejecting a true null hypothesis.
The act of rejecting a true null hypothesis is called a type I error. The probability
of making a type I error is termed the significance level and is denoted by $\alpha$.
Similarly, even when $H_0$ is false ($H_1$ is true), sampling variation can produce a
very small value of the test statistic and the investigator may be tricked into not
rejecting (accepting) a false null hypothesis. The act of not rejecting a false null
hypothesis is called a type II error. The probability of making a type II error is
denoted by $\beta$ and $1 - \beta$ is termed the power of the test.

The critical region of a test statistic is made up of extreme values of the test
statistic such that $H_0$ is rejected if the test statistic falls in the critical region.
The boundaries of a critical region are determined such that the probability of
rejecting the null hypothesis when it is true is just equal to the chosen level of
significance. For a given value of $\alpha$, the boundary values of a critical region are
called critical values. The level of significance may be chosen freely, although
$\alpha$-values of 0.05 and 0.01 are frequently used. The p-value is defined as the
probability, computed under $H_0$, of obtaining a value of the test statistic which is
at least as extreme (greater or smaller) as the value calculated from the sample
data. The p-value, being a probability, ranges between 0 and 1. If the p-value is
very small, we prefer the alternative hypothesis $H_1$; that is, we reject $H_0$ in favor
of $H_1$. Conversely, if the p-value is large, we naturally prefer the null hypothesis;
that is, we do not reject (accept) $H_0$. Note that $\alpha$ is the maximum p-value at which
we decide to reject $H_0$.

The steps in hypothesis testing can be summarized as follows: 


(1) The hypothesis under consideration is formulated by specifying the null 
and alternative hypotheses. 

(2) A value of the level of significance ($\alpha$) is chosen in advance. The most
common values of $\alpha$ are 0.05 and 0.01.




(3) The test statistic for the problem is selected and its value for the sample 
data is calculated. 

(4) The sampling distribution of the test statistic under the assumption
of the parent distribution of the study population is determined. The
most common sampling distributions of a test statistic are the $t$, $\chi^2$,
and $F$ distributions.

(5) The critical value(s) corresponding to the chosen value of $\alpha$ in step 2 is
(are) determined from the theoretical values of the sampling distribution
of the test statistic identified in step 4 and the associated critical region
is defined.

(6) The null hypothesis $H_0$ is rejected or accepted depending upon whether
the value of the statistic calculated in step 3 falls inside or outside the
critical region.

(7) The p-value is calculated and reported. 
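
The seven steps above can be traced in a few lines of code. The following is a minimal sketch, assuming an upper-tail one-sample z-test of $H_0$: $\mu = \mu_0$ against $H_1$: $\mu > \mu_0$ with known standard deviation; the function name, the data values, $\mu_0$, and $\sigma$ are hypothetical and serve only as an illustration.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_test_upper(sample, mu0, sigma, alpha=0.05):
    """Upper-tail one-sample z-test of H0: mu = mu0 vs H1: mu > mu0 (sigma known)."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / sqrt(n))   # step 3: test statistic
    std_normal = NormalDist()                      # step 4: distribution under H0
    critical = std_normal.inv_cdf(1 - alpha)       # step 5: critical value
    reject = z > critical                          # step 6: decision
    p_value = 1 - std_normal.cdf(z)                # step 7: p-value
    return z, critical, reject, p_value


# Hypothetical data (steps 1 and 2: H0: mu = 10, H1: mu > 10, alpha = 0.05).
sample = [10.4, 9.8, 10.9, 10.6, 10.1, 10.7, 10.3, 10.8, 10.5]
z, critical, reject, p_value = z_test_upper(sample, mu0=10.0, sigma=0.6, alpha=0.05)
print(f"z = {z:.4f}, critical value = {critical:.4f}, "
      f"reject H0: {reject}, p-value = {p_value:.4f}")
```

In this sketch the decision in step 6 and the comparison of the p-value with $\alpha$ always agree, which mirrors the remark that $\alpha$ is the maximum p-value at which $H_0$ is rejected.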


In a more formal statistical procedure, a sample size is chosen in advance that
guarantees an acceptably high statistical power (e.g., $1 - \beta = 0.90$) of rejecting
$H_0$ at a given level of significance (e.g., $\alpha = 0.05$).
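
As an illustration of choosing a sample size in advance, the sketch below assumes an upper-tail one-sample z-test with known $\sigma$ and a smallest difference $\delta$ worth detecting, for which $n \ge ((z_{1-\alpha} + z_{1-\beta})\,\sigma/\delta)^2$. The function names and the values of $\delta$ and $\sigma$ are hypothetical; other tests require other formulas.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_upper_z(delta, sigma, alpha=0.05, power=0.90):
    """Smallest n giving the requested power for an upper-tail z-test."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha)   # critical value of the test
    z_beta = z.inv_cdf(power)        # quantile matching 1 - beta
    return ceil(((z_alpha + z_beta) * sigma / delta) ** 2)


def power_upper_z(n, delta, sigma, alpha=0.05):
    """Power of the upper-tail z-test when the true mean exceeds mu0 by delta."""
    z = NormalDist()
    return 1 - z.cdf(z.inv_cdf(1 - alpha) - delta * sqrt(n) / sigma)


# Hypothetical planning values: detect a shift of half a standard deviation.
n_required = sample_size_upper_z(delta=0.5, sigma=1.0, alpha=0.05, power=0.90)
print(n_required, round(power_upper_z(n_required, 0.5, 1.0), 4))
```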

A statistical test is called exact if its level of significance is exactly equal
to a given value of $\alpha$. Often, it is not possible to obtain a test with a level of
significance exactly equal to $\alpha$, and then the test is referred to as an approximate
test. An approximate test with the level of significance less than or equal to
$\alpha$ is called a conservative test. Similarly, an approximate test with the level
of significance greater than or equal to $\alpha$ is called a liberal test. In general,
conservative tests are often preferred when only approximate tests are available.
However, if it is known that the actual level of significance of a liberal test is
not much greater than $\alpha$, the liberal test can be recommended.
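
A discrete test statistic gives a simple illustration of why an exact level is not always attainable. The sketch below assumes a hypothetical upper-tail test of $p = 0.5$ based on a binomial count with $n = 10$ and nominal $\alpha = 0.05$; the setting is chosen for illustration and does not come from the text.

```python
from math import comb


def upper_tail_prob(n, c, p0=0.5):
    """P(X >= c) for X ~ Binomial(n, p0): the actual level of 'reject if X >= c'."""
    return sum(comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(c, n + 1))


n, alpha = 10, 0.05
level_c9 = upper_tail_prob(n, 9)   # rejecting when X >= 9
level_c8 = upper_tail_prob(n, 8)   # rejecting when X >= 8
print(f"cutoff 9: actual level {level_c9:.4f} (conservative)")
print(f"cutoff 8: actual level {level_c8:.4f} (liberal)")
```

No cutoff gives a level of exactly 0.05 here: the cutoff 9 yields a conservative test and the cutoff 8 a liberal one whose actual level is only slightly above the nominal value.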


POINT ESTIMATION 


In point estimation of a parameter, a selected function of the sample values,
known as an estimator, is used to make the best guess we can concerning the
unknown value of the parameter. The idea of the “best” guess is that the
estimator yields a sample value which in some sense is close to the value of the
unknown parameter. The observed numerical value obtained by using an
estimator for a given sample is called an estimate. Since an estimate will assume
different values for different samples, it will be close to the true parameter
value for some samples and will be far from the parameter value for other
samples.

Statistical theory uses various criteria to judge the “goodness” or “merit” of
an estimator. One desirable criterion or property used for this purpose is that
of unbiasedness. An estimator is said to be unbiased if its average or expected
value is equal to the parameter being estimated. More precisely, an estimator
$\hat{\theta}_n$ of a parameter $\theta$ is unbiased if


$$E(\hat{\theta}_n) = \theta.$$




An example of an unbiased estimator is the sample variance defined by


$$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1},$$


which is an unbiased estimator of the population variance $\sigma^2$. To see that $S^2$
is an unbiased estimator of $\sigma^2$, one can write down every possible sample of
size $n$ which could be selected from a population, and compute $S^2$ for each
sample. If we calculate the average value of the $S^2$'s from all possible samples, we
would get $\sigma^2$. Obviously, one cannot enumerate every possible sample when
the population is infinitely large, but one can derive the property of an unbiased
estimator from the sampling distribution of the estimator.
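
The enumeration argument can be checked directly for a tiny population. The sketch below assumes a hypothetical population of three values and equally likely ordered samples of size 2 drawn with replacement; exact fractions avoid rounding. Under sampling without replacement from a finite population the average of $S^2$ differs by the factor $N/(N-1)$, so the with-replacement assumption matters.

```python
from fractions import Fraction
from itertools import product


def sample_variance(xs):
    """S^2 with divisor n - 1, computed exactly."""
    n = len(xs)
    xbar = Fraction(sum(xs), n)
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)


population = [2, 4, 9]                         # hypothetical population values
N = len(population)
mu = Fraction(sum(population), N)
sigma2 = sum((x - mu) ** 2 for x in population) / N   # population variance

n = 2
samples = list(product(population, repeat=n))  # all N**n ordered samples
avg_s2 = sum(sample_variance(s) for s in samples) / len(samples)
print(avg_s2, sigma2)
```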

Another desirable property of an estimator is that of consistency. An estimator
is said to be consistent if it approximates more closely the true parameter value
with increasing sample size. More precisely, an estimator $\hat{\theta}_n$ is a consistent
estimator of $\theta$ if for any positive real number $\epsilon$


$$\lim_{n \to \infty} P(|\hat{\theta}_n - \theta| > \epsilon) = 0.$$
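
As a worked illustration of this definition (a standard argument, not taken from the text), consider the sample mean $\bar{X}_n$ of $n$ independent observations with common mean $\mu$ and finite variance $\sigma^2$; these symbols are assumptions introduced here. Chebyshev's inequality gives consistency directly:

```latex
\[
P\bigl(|\bar{X}_n - \mu| > \epsilon\bigr)
  \;\le\; \frac{\operatorname{Var}(\bar{X}_n)}{\epsilon^{2}}
  \;=\; \frac{\sigma^{2}}{n\,\epsilon^{2}}
  \;\longrightarrow\; 0
  \qquad \text{as } n \to \infty .
\]
```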


A further desirable property of an estimator is that of efficiency. An estimator
is said to be efficient if it has minimum variance¹ in the class of all unbiased
estimators. A minimum variance unbiased (MVU) estimator is frequently referred
to as the “best” estimator.


INTERVAL ESTIMATION 


In many studies it is generally not enough to obtain just a single value as an
estimate for the unknown parameter. It is generally required to specify an index
or measure of reliability or uncertainty associated with the estimate. A point
estimate of a parameter provides no such information. A method of estimation
known as interval estimation does provide this kind of information. In an interval
estimation of an unknown parameter $\theta$, an interval with endpoints $\theta_L$ and
$\theta_U$ is constructed such that


$$P(\theta_L < \theta < \theta_U) = 1 - \alpha. \qquad \text{(W.1)}$$


The quantity $1 - \alpha$ in equation (W.1) is known as the confidence coefficient or
the level of confidence. The typical values of a confidence coefficient are 0.99,
0.95, and 0.90, although other values can also be chosen.


¹ The variance of an estimator provides a measure of the sampling error that describes the
uncertainty of inference based on a particular sample. The square root of the variance of an
estimator is called the standard error of the estimator.




A confidence interval given by equation (W.1) is called exact if the strict
equality holds. Often, the equality relationship holds only approximately and
then the interval is referred to as an approximate interval. To emphasize the fact
that an interval is approximate, equation (W.1) is written as


$$P(\theta_L < \theta < \theta_U) \approx 1 - \alpha. \qquad \text{(W.2)}$$

An approximate interval (W.2) is called conservative if

$$P(\theta_L < \theta < \theta_U) > 1 - \alpha. \qquad \text{(W.3)}$$

Similarly, an approximate interval (W.2) is called liberal if

$$P(\theta_L < \theta < \theta_U) < 1 - \alpha. \qquad \text{(W.4)}$$


In general, conservative intervals are preferred when only approximate intervals
are available. However, if it is known that the actual confidence coefficient of
a liberal interval is not much lower than $1 - \alpha$, the liberal interval can be
recommended.

The interval given by equation (W.1) is called a two-sided confidence interval
because it has both lower and upper endpoints. In many situations an investigator
is interested in an interval with only one endpoint. An interval with only one
endpoint is referred to as a one-sided interval. A one-sided interval that satisfies
the equation


$$P(\theta_L < \theta < \infty) = 1 - \alpha$$


is called an upper confidence interval. Similarly, an interval that satisfies the
equation


$$P(-\infty < \theta < \theta_U) = 1 - \alpha$$


is called a lower confidence interval. In this volume, we only consider two-sided
confidence intervals. However, one-sided intervals can be readily obtained from
two-sided intervals with only a minor modification.
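
The relation between two-sided and one-sided intervals can be sketched for the simplest case. The code below assumes a normal mean with known standard deviation, so that the interval is based on standard normal quantiles; the function name and the data are hypothetical. The "minor modification" amounts to replacing $\alpha/2$ by $\alpha$ at the single finite endpoint.

```python
from math import inf, sqrt
from statistics import NormalDist, mean


def z_intervals(sample, sigma, alpha=0.05):
    """Two-sided and one-sided intervals for a normal mean with sigma known."""
    xbar = mean(sample)
    se = sigma / sqrt(len(sample))
    z = NormalDist()
    half = z.inv_cdf(1 - alpha / 2) * se    # two-sided: alpha split over both tails
    one = z.inv_cdf(1 - alpha) * se         # one-sided: all of alpha in one tail
    two_sided = (xbar - half, xbar + half)
    lower_endpoint_only = (xbar - one, inf)     # P(theta_L < theta < inf) = 1 - alpha
    upper_endpoint_only = (-inf, xbar + one)    # P(-inf < theta < theta_U) = 1 - alpha
    return two_sided, lower_endpoint_only, upper_endpoint_only


# Hypothetical data, for illustration only.
sample = [10.4, 9.8, 10.9, 10.6, 10.1, 10.7, 10.3, 10.8, 10.5]
two_sided, lower_only, upper_only = z_intervals(sample, sigma=0.6, alpha=0.05)
print(two_sided, lower_only, upper_only)
```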

As with estimators, statistical theory uses several criteria to judge the goodness
or merit of an interval. Again, a desirable criterion or property of an
interval is that of unbiasedness. A confidence interval is said to be unbiased
if the probability of containing any value not equal to the true value of $\theta$ is
less than or equal to $1 - \alpha$. Another desirable property of an interval is that of
being uniformly most accurate (UMA). A confidence interval is said to be uniformly
most accurate if the interval has a smaller probability of containing a value
not equal to $\theta$ than any other interval with confidence coefficient $1 - \alpha$. A
further desirable property of an interval is that of being uniformly most accurate
unbiased (UMAU). A confidence interval that is uniformly most accurate within
the class of all unbiased confidence intervals is called a uniformly most
accurate unbiased confidence interval. A final desirable property of an interval
is that of uniformly shortest length (USL). A confidence interval is said to be of
uniformly shortest length if its expected length is shorter than or equal to that of
any other interval with confidence coefficient $1 - \alpha$. Generally, if a two-sided
confidence interval is UMA (UMAU), then the expected length is shortest
within the class of all (unbiased) confidence intervals. For a detailed and
rigorous discussion of the properties of confidence intervals, see Graybill (1976,
Section 2.9).


X SOME SELECTED LATIN SQUARES 


This appendix contains some more representations of Latin squares from 3 x 3
to 12 x 12.


3 x 3    4 x 4
         1       2       3       4
ABC      ABCD    ABCD    ABCD    ABCD
BCA      BADC    BCDA    BDAC    BADC
CAB      CDBA    CDAB    CADB    CDAB
         DCAB    DABC    DCBA    DCBA

5 x 5    6 x 6     7 x 7
ABCDE    ABCDEF    ABCDEFG
BAECD    BFDCAE    BCDEFGA
CDAEB    CDEFBA    CDEFGAB
DEBAC    DAFECB    DEFGABC
ECDBA    ECABFD    EFGABCD
         FEBADC    FGABCDE
                   GABCDEF

8 x 8       9 x 9        10 x 10
ABCDEFGH    ABCDEFGHI    ABCDEFGHIJ
BCDEFGHA    BCDEFGHIA    BCDEFGHIJA
CDEFGHAB    CDEFGHIAB    CDEFGHIJAB
DEFGHABC    DEFGHIABC    DEFGHIJABC
EFGHABCD    EFGHIABCD    EFGHIJABCD
FGHABCDE    FGHIABCDE    FGHIJABCDE
GHABCDEF    GHIABCDEF    GHIJABCDEF
HABCDEFG    HIABCDEFG    HIJABCDEFG
            IABCDEFGH    IJABCDEFGH
                         JABCDEFGHI


11 x 11


ABCDEFGHIJK 
BCDEFGHIJKA 
CDEFGHIJKAB 
DEFGHIJKABC 
EFGHIJKABCD 
FGHIJKABCDE 
GHIJKABCDEF 
HIJKABCDEFG 
IJKABCDEFGH 
JKABCDEFGHI 
KABCDEFGHIJ 




12 x 12 


ABCDEFGHIJKL
BCDEFGHIJKLA 
CDEFGHIJKLAB 
DEFGHIJKLABC 
EFGHIJKLABCD 
FGHIJKLABCDE 
GHIJKLABCDEF 
HIJKLABCDEFG 
IJKLABCDEFGH 
JKLABCDEFGHI 
KLABCDEFGHIJ 
LABCDEFGHIJK 


Source: Cochran and Cox (1957, pp. 145-146). Used with permission. 
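
The 7 x 7 through 12 x 12 squares listed above are cyclic: each row is the previous row shifted one place to the left. The sketch below generates such squares and checks the defining property of a Latin square (every letter exactly once in each row and each column). The function names are hypothetical, and the check is a convenience for verifying transcriptions rather than part of the source.

```python
from string import ascii_uppercase


def cyclic_latin_square(n):
    """Return the n x n cyclic Latin square as a list of row strings."""
    letters = ascii_uppercase[:n]
    return [letters[i:] + letters[:i] for i in range(n)]


def is_latin_square(rows):
    """True if every symbol occurs exactly once in each row and each column."""
    n = len(rows)
    symbols = set(rows[0])
    rows_ok = all(len(r) == n and set(r) == symbols for r in rows)
    cols_ok = rows_ok and all({r[j] for r in rows} == symbols for j in range(n))
    return len(symbols) == n and rows_ok and cols_ok


for row in cyclic_latin_square(7):
    print(row)

# The non-cyclic 5 x 5 square printed in this appendix can be checked the same way.
five_by_five = ["ABCDE", "BAECD", "CDAEB", "DEBAC", "ECDBA"]
print(is_latin_square(five_by_five))
```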


Y SOME SELECTED GRAECO-LATIN SQUARES 


This appendix contains some more representations of Graeco-Latin squares 
from 7 x 7 to 12 x 12. 


7X7 8x8 
Aa Be Cg De Ey Fy Gs Aw Be Cg Dy E, Fs Go He 
Ba Ce Dy Ey, Fs Ga Ae By Ap Ga Fy, Ay De Ce Es 
Cy Dy, Es Fa Ge Ap Be Cy Gs An Ev Dg He Be Fo 
Ds Ey Fe Gg Age By Cy Ds Fy Ez Ae Co Ba Hy, Gg 
E, Fg Ge Ay By, Cs Da E; Hy Do Cs At Gy Fp By, 
Fe Gy, An Bs Ca De Eg Fz D, Hs Bo Ge Ap Ey Ca 
G, As Ba C, Dp Ex Fy, G, Ce B, Hp Fa Eg As Dz 
Hp Eg Fe Ge Bs Cy Da Ay 
9x9 
Ag By, Cp D, E. Fg Gs Hy I, 
Bg Ca Ay Eg F, D, A, Is G¢ 
Cy Ag By iF Do E, IE: G; Hs 
Ds E¢ F, Gy Ay Ig Ay B, Ca 
ae F; Dg Ap le Gy Bg Cy Ar 
Fr D-, Es L, Gp Aa C. Ag B, 
G, H, Ig Ags B: Ci De Ey Fg 
Hg I, Gr B, Cs Ag Ep Fy Dy 
I, Go A, C; Ag Bs Fy, Dg Ei 




K, Le Is. Ja Gy He Eo Fe Cy Dg An Br 
Lu Ke Jy Ip Hs Gr Fy Eg Do Ca Br Ax 


Source: Cochran and Cox (1957, pp. 146-147). Used with permission. 
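
A Graeco-Latin square superimposes two Latin squares so that every (Latin, Greek) pair occurs exactly once. The sketch below assumes one standard cyclic construction that works for odd orders, with Latin index $(i + j) \bmod n$ and Greek index $(2i + j) \bmod n$; it is not claimed to reproduce the particular squares of Cochran and Cox, and the function names are hypothetical. Symbols are represented by integer indices.

```python
def graeco_latin_square(n):
    """Cyclic Graeco-Latin square of odd order n as rows of (latin, greek) index pairs."""
    if n % 2 == 0:
        raise ValueError("this cyclic construction requires an odd order")
    return [[((i + j) % n, (2 * i + j) % n) for j in range(n)] for i in range(n)]


def is_graeco_latin(square):
    """Check both component squares are Latin and all n*n pairs are distinct."""
    n = len(square)
    full = set(range(n))
    for k in (0, 1):   # k = 0: Latin component, k = 1: Greek component
        rows_ok = all({cell[k] for cell in row} == full for row in square)
        cols_ok = all({square[i][j][k] for i in range(n)} == full for j in range(n))
        if not (rows_ok and cols_ok):
            return False
    pairs = {cell for row in square for cell in row}
    return len(pairs) == n * n   # orthogonality: each pair exactly once


square7 = graeco_latin_square(7)
print(is_graeco_latin(square7))
```

The same checker can be applied to a transcribed square once its letters are mapped to indices, which is a practical way to detect copying errors.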


Z PROC MIXED OUTPUTS FOR SOME SELECTED 
WORKED EXAMPLES 


In this appendix, we include some additional outputs using SAS PROC MIXED 
for some selected worked examples given in Sections 4.16, 7.9, and 10.6. The 
outputs for these examples using PROC GLM were included in Figures 4.4, 
7.4, and 10.9. We did not include these outputs there because the methodology 
underlying these analyses has not been discussed in this volume. We hope 
that the readers with adequate background will find these results interesting 
and useful because they contain estimates of variance components using the 
maximum likelihood (ML) and the restricted maximum likelihood (REML) 
procedures. It should be remarked that for balanced designs when the analysis 
of variance estimates of variance components are nonnegative, they are identical 
to the REML estimates given here. However, the results of the F tests for the
fixed effects are generally not equivalent to significance tests produced by the
PROC MIXED procedure.


Output for the Worked Example in Section 4.16

DATA YIELDLOD;
INPUT AGING MIX YIELD;
DATALINES;
1 1 574
1 1 564
1 1 550
1 2 524
. . .
2 3 1055
;
PROC MIXED;
CLASSES AGING MIX;
MODEL YIELD=AGING;
RANDOM MIX AGING*MIX;
RUN;

The SAS System
The MIXED Procedure

Class Level Information
Class     Levels   Values
AGING     2        1 2
MIX       3        1 2 3

REML Estimation Iteration History
Iter.   Eval.   Objective       Criterion
0       1       124.13401764
1       1       122.98420101    0.00000000
Convergence criteria met.

Covariance Parameter Estimates (REML)
Cov Parm     Estimate
MIX          182.59259259
AGING*MIX    10.31481481
Residual     32.22222222

Model Fitting Information for YIELD
Description               Value
Observations              18.0000
Res Log Likelihood        -76.1951
Akaike's Inf. Crit.       -79.1951
Schwarz's Bayes. Crit.    -80.3540
-2 Res Log Likelihood     152.3902

Tests of Fixed Effects
Source    NDF   DDF   Type III F   Pr > F
AGING     1     2     1965.80      0.0005


Output for the Worked Example in Section 7.9

DATA GLYCOGEN;
INPUT TREATMNT $ RAT . . .
PROC MIXED;
CLASS TREATMNT RAT PREPARAT;
MODEL GLY=TREAT;
RANDOM RAT(TREAT) PREPARAT(RAT TREAT);
RUN;

The SAS System
The MIXED Procedure

Class Level Information
. . .

REML Estimation Iteration History
Iter.   Eval.   Objective       Criterion
0       1       171.91789976
1       1       158.97132463    0.00000000
Convergence criteria met.

Covariance Parameter Estimates (REML)
Cov Parm                 Estimate
RAT(TREATMNT)            36.06481481
PREPARAT(TREATMNT*RAT)   14.16666667
Residual                 21.16666667

Model Fitting Information for GLYCOGEN
Description               Value
Observations              36.0000
Res Log Likelihood        -109.811
Akaike's Inf. Crit.       -112.811
Schwarz's Bayes. Crit.    -115.055
-2 Res Log Likelihood     219.6213

Tests of Fixed Effects
Source      NDF   DDF   Type III F   Pr > F
TREATMNT    2     3     2.93         0.1971


Output for the Worked Example in Section 10.6

DATA SOYBEAN;
INPUT BLOCK SPACING VARIETY $ YIELD;
DATALINES;
1 18 OM 33.6
. . .
;
PROC MIXED;
CLASS BLOCK VARIETY SPACING;
MODEL YIELD=VARIETY SPACING VARIETY*SPACING;
RANDOM BLOCK VARIETY*BLOCK SPACING*BLOCK;
RUN;

The SAS System
The MIXED Procedure

Class Level Information
Class      Levels   Values
BLOCK      6        1 2 3 4 5 6
VARIETY    2        B OM
SPACING    5        18 24 30 36 42

REML Estimation Iteration History
Iter.   Eval.   Objective       Criterion
0       1       146.43893356
1       3       146.27969658    0.00011090
2       2       146.27230705    0.00000161
3       1       146.27218808    0.00000000
Convergence criteria met.

Covariance Parameter Estimates (REML)
Cov Parm         Estimate
BLOCK            0.14028209
VARIETY*BLOCK    0.00000000
SPACING*BLOCK    0.00000000
Residual         4.66840543

Model Fitting Information for YIELD
Description               Value
Observations              60.0000
Res Log Likelihood        -119.083
Akaike's Inf. Crit.       -123.083
Schwarz's Bayes. Crit.    -126.907
-2 Res Log Likelihood     238.1660

Tests of Fixed Effects
Source     NDF   DDF   Type III F   Pr > F
VARIETY    1     5     102.33       0.0002
SPACING    4     20    11.04        0.0001
VAR*SPA    4     20    1.36         0.2820

(iii) SAS PROC MIXED Instructions and Output for the Worked Example in Section 10.6 


Statistical Tables and Charts 


Table I. Cumulative Standard Normal Distribution 


This table gives the area under the standard normal curve from $-\infty$ to the
indicated values of z. The values of z are provided from 0.00 to 3.99 in increments
of 0.01 units.


Examples: (i) P(Z < 1.47) = 0.9292.
(ii) P(Z > 2.12) = 1 - P(Z < 2.12) = 1 - 0.9830 = 0.0170.
(iii) P(Z < -2.51) = 1 - P(Z < 2.51) = 1 - 0.9940 = 0.0060.
(iv) P(-1.21 < Z < 2.68) = P(Z < 2.68) - P(Z < -1.21) = 0.9963 - 0.1131 =
0.8832.
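
The four examples can be reproduced without the table. The sketch below uses `NormalDist` from the Python standard library, a tool choice made here for illustration; the table itself was computed with IMSL routines. Results agree with the tabled values to four decimal places.

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal: mean 0, standard deviation 1

p1 = Z.cdf(1.47)                   # (i)   P(Z < 1.47)
p2 = 1 - Z.cdf(2.12)               # (ii)  P(Z > 2.12)
p3 = Z.cdf(-2.51)                  # (iii) P(Z < -2.51), equal to 1 - P(Z < 2.51)
p4 = Z.cdf(2.68) - Z.cdf(-1.21)    # (iv)  P(-1.21 < Z < 2.68)

for label, p in (("i", p1), ("ii", p2), ("iii", p3), ("iv", p4)):
    print(f"({label}) {p:.4f}")
```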




z P(Z<z) z P(Z<z) z P(Z<z) z P(Z<z) z P(Z<z) z P(Z<z) z P(Z<z) z P(Z<z)


0.00 0.5000 0.50 0.6915 1.00 0.8413 1.50 0.9332 2.00 0.9772 2.50 0.9938 3.00 0.9987 3.50 0.9998 
0.01 0.5040 0.51 0.6950 1.01 0.8438 1.51 0.9345 2.01 0.9778 2.51 0.9940 3.01 0.9987 3.51 0.9998 
0.02 0.5080 0.52 0.6985 1.02 0.8461 1.52 0.9357 2.02 0.9783 2.52 0.9941 3.02 0.9987 3.52 0.9998 
0.03 0.5120 0.53 0.7019 1.03 0.8485 1.53 0.9370 2.03 0.9788 2.53 0.9943 3.03 0.9988 3.53 0.9998 
0.04 0.5160 0.54 0.7054 1.04 0.8508 1.54 0.9382 2.04 0.9793 2.54 0.9945 3.04 0.9988 3.54 0.9998 


0.05 0.5199 0.55 0.7088 1.05 0.8531 1.55 0.9394 2.05 0.9798 2.55 0.9946 3.05 0.9989 3.55 0.9998 
0.06 0.5239 0.56 0.7123 1.06 0.8554 1.56 0.9406 2.06 0.9803 2.56 0.9948 3.06 0.9989 3.56 0.9998 
0.07 0.5279 0.57 0.7157 1.07 0.8577 1.57 0.9418 2.07 0.9808 2.57 0.9949 3.07 0.9989 3.57 0.9998 
0.08 0.5319 0.58 0.7190 1.08 0.8599 1.58 0.9429 2.08 0.9812 2.58 0.9951 3.08 0.9990 3.58 0.9998 
0.09 0.5359 0.59 0.7224 1.09 0.8621 1.59 0.9441 2.09 0.9817 2.59 0.9952 3.09 0.9990 3.59 0.9998 


0.10 0.5398 0.60 0.7257 1.10 0.8643 1.60 0.9452 2.10 0.9821 2.60 0.9953 3.10 0.9990 3.60 0.9998 
0.11 0.5438 0.61 0.7291 1.11 0.8665 1.61 0.9463 2.11 0.9826 2.61 0.9955 3.11 0.9991 3.61 0.9998 
0.12 0.5478 0.62 0.7324 1.12 0.8686 1.62 0.9474 2.12 0.9830 2.62 0.9956 3.12 0.9991 3.62 0.9999 
0.13 0.5517 0.63 0.7357 1.13 0.8708 1.63 0.9484 2.13 0.9834 2.63 0.9957 3.13 0.9991 3.63 0.9999 
0.14 0.5557 0.64 0.7389 1.14 0.8729 1.64 0.9495 2.14 0.9838 2.64 0.9959 3.14 0.9992 3.64 0.9999 


0.15 0.5596 0.65 0.7422 1.15 0.8749 1.65 0.9505 2.15 0.9842 2.65 0.9960 3.15 0.9992 3.65 0.9999 
0.16 0.5636 0.66 0.7454 1.16 0.8770 1.66 0.9515 2.16 0.9846 2.66 0.9961 3.16 0.9992 3.66 0.9999 
0.17 0.5675 0.67 0.7486 1.17 0.8790 1.67 0.9525 2.17 0.9850 2.67 0.9962 3.17 0.9992 3.67 0.9999 
0.18 0.5714 0.68 0.7517 1.18 0.8810 1.68 0.9535 2.18 0.9854 2.68 0.9963 3.18 0.9993 3.68 0.9999 
0.19 0.5753 0.69 0.7549 1.19 0.8830 1.69 0.9545 2.19 0.9857 2.69 0.9964 3.19 0.9993 3.69 0.9999 


0.20 0.5793 0.70 0.7580 1.20 0.8849 1.70 0.9554 2.20 0.9861 2.70 0.9965 3.20 0.9993 3.70 0.9999 
0.21 0.5832 0.71 0.7611 1.21 0.8869 1.71 0.9564 2.21 0.9864 2.71 0.9966 3.21 0.9993 3.71 0.9999 
0.22 0.5871 0.72 0.7642 1.22 0.8888 1.72 0.9573 2.22 0.9868 2.72 0.9967 3.22 0.9994 3.72 0.9999 
0.23 0.5910 0.73 0.7673 1.23 0.8907 1.73 0.9582 2.23 0.9871 2.73 0.9968 3.23 0.9994 3.73 0.9999 
0.24 0.5948 0.74 0.7703 1.24 0.8925 1.74 0.9591 2.24 0.9875 2.74 0.9969 3.24 0.9994 3.74 0.9999 


0.25 0.5987 0.75 0.7734 1.25 0.8944 1.75 0.9599 2.25 0.9878 2.75 0.9970 3.25 0.9994 3.75 0.9999 
0.26 0.6026 0.76 0.7764 1.26 0.8962 1.76 0.9608 2.26 0.9881 2.76 0.9971 3.26 0.9994 3.76 0.9999 
0.27 0.6064 0.77 0.7794 1.27 0.8980 1.77 0.9616 2.27 0.9884 2.77 0.9972 3.27 0.9995 3.77 0.9999 
0.28 0.6103 0.78 0.7823 1.28 0.8997 1.78 0.9625 2.28 0.9887 2.78 0.9973 3.28 0.9995 3.78 0.9999 
0.29 0.6141 0.79 0.7852 1.29 0.9015 1.79 0.9633 2.29 0.9890 2.79 0.9974 3.29 0.9995 3.79 0.9999 


0.30 0.6179 0.80 0.7881 1.30 0.9032 1.80 0.9641 2.30 0.9893 2.80 0.9974 3.30 0.9995 3.80 0.9999 
0.31 0.6217 0.81 0.7910 1.31 0.9049 1.81 0.9649 2.31 0.9896 2.81 0.9975 3.31 0.9995 3.81 0.9999 
0.32 0.6255 0.82 0.7939 1.32 0.9066 1.82 0.9656 2.32 0.9898 2.82 0.9976 3.32 0.9995 3.82 0.9999 
0.33 0.6293 0.83 0.7967 1.33 0.9082 1.83 0.9664 2.33 0.9901 2.83 0.9977 3.33 0.9996 3.83 0.9999 
0.34 0.6331 0.84 0.7995 1.34 0.9099 1.84 0.9671 2.34 0.9904 2.84 0.9977 3.34 0.9996 3.84 0.9999 


0.35 0.6368 0.85 0.8023 1.35 0.9115 1.85 0.9678 2.35 0.9906 2.85 0.9978 3.35 0.9996 3.85 0.9999 
0.36 0.6406 0.86 0.8051 1.36 0.9131 1.86 0.9686 2.36 0.9909 2.86 0.9979 3.36 0.9996 3.86 0.9999 
0.37 0.6443 0.87 0.8078 1.37 0.9147 1.87 0.9693 2.37 0.9911 2.87 0.9979 3.37 0.9996 3.87 1.0000 
0.38 0.6480 0.88 0.8106 1.38 0.9162 1.88 0.9699 2.38 0.9913 2.88 0.9980 3.38 0.9996 3.88 1.0000 
0.39 0.6517 0.89 0.8133 1.39 0.9177 1.89 0.9706 2.39 0.9916 2.89 0.9981 3.39 0.9997 3.89 1.0000 


0.40 0.6554 0.90 0.8159 1.40 0.9192 1.90 0.9713 2.40 0.9918 2.90 0.9981 3.40 0.9997 3.90 1.0000 
0.41 0.6591 0.91 0.8186 1.41 0.9207 1.91 0.9719 2.41 0.9920 2.91 0.9982 3.41 0.9997 3.91 1.0000 
0.42 0.6628 0.92 0.8212 1.42 0.9222 1.92 0.9726 2.42 0.9922 2.92 0.9982 3.42 0.9997 3.92 1.0000 
0.43 0.6664 0.93 0.8238 1.43 0.9236 1.93 0.9732 2.43 0.9925 2.93 0.9983 3.43 0.9997 3.93 1.0000 
0.44 0.6700 0.94 0.8264 1.44 0.9251 1.94 0.9738 2.44 0.9927 2.94 0.9984 3.44 0.9997 3.94 1.0000 


0.45 0.6736 0.95 0.8289 1.45 0.9265 1.95 0.9744 2.45 0.9929 2.95 0.9984 3.45 0.9997 3.95 1.0000 
0.46 0.6772 0.96 0.8315 1.46 0.9279 1.96 0.9750 2.46 0.9931 2.96 0.9985 3.46 0.9997 3.96 1.0000 
0.47 0.6808 0.97 0.8340 1.47 0.9292 1.97 0.9756 2.47 0.9932 2.97 0.9985 3.47 0.9997 3.97 1.0000 
0.48 0.6844 0.98 0.8365 1.48 0.9306 1.98 0.9761 2.48 0.9934 2.98 0.9986 3.48 0.9997 3.98 1.0000 
0.49 0.6879 0.99 0.8389 1.49 0.9319 1.99 0.9767 2.49 0.9936 2.99 0.9986 3.49 0.9998 3.99 1.0000 


Computed Using IMSL* Library Functions. 


* IMSL (International Mathematical and Statistical Library) is a registered trade mark of IMSL, 
Inc. 




Table II. Percentage Points of the Standard Normal Distribution 


This table is the inverse of Table I. The entries in the table give z values
(percentiles) corresponding to a given cumulative probability (i.e., P(Z < z)),
which represents all the area to the left of the z value. The values of P(Z < z)
are given from 0.0001 to 0.9999.


Examples: (i) P(Z < z) = 0.005, z = -2.57583.
(ii) P(Z < z) = 0.200, z = -0.84162.
(iii) P(Z < z) = 0.800, z = 0.84162.
(iv) P(Z < z) = 0.950, z = 1.64485.

P(Z < z)     z       P(Z < z)     z       P(Z < z)     z       P(Z < z)    z       P(Z < z)    z

0.0001   -3.71902    0.165   -0.97411    0.390   -0.27932    0.615   0.29237    0.8400   0.99446
0.0002   -3.54008    0.170   -0.95417    0.395   -0.26631    0.620   0.30548    0.8450   1.01522
0.0003   -3.43161    0.175   -0.93459    0.400   -0.25335    0.625   0.31864    0.8500   1.03643
0.0004   -3.35279    0.180   -0.91537    0.405   -0.24043    0.630   0.33185    0.8550   1.05812
0.0005   -3.29053    0.185   -0.89647    0.410   -0.22754    0.635   0.34513    0.8600   1.08032
0.0010   -3.09023    0.190   -0.87790    0.415   -0.21470    0.640   0.35846    0.8650   1.10306
0.0020   -2.87816    0.195   -0.85962    0.420   -0.20189    0.645   0.37186    0.8700   1.12639
0.0030   -2.74778    0.200   -0.84162    0.425   -0.18912    0.650   0.38532    0.8750   1.15035
0.0040   -2.65207    0.205   -0.82389    0.430   -0.17637    0.655   0.39886    0.8800   1.17499
0.0050   -2.57583    0.210   -0.80642    0.435   -0.16366    0.660   0.41246    0.8850   1.20036
0.0060   -2.51214    0.215   -0.78919    0.440   -0.15097    0.665   0.42615    0.8900   1.22653
0.0070   -2.45726    0.220   -0.77219    0.445   -0.13830    0.670   0.43991    0.8950   1.25357
0.0080   -2.40892    0.225   -0.75542    0.450   -0.12566    0.675   0.45376    0.9000   1.28155
0.0090   -2.36562    0.230   -0.73885    0.455   -0.11304    0.680   0.46770    0.9050   1.31058
0.0100   -2.32635    0.235   -0.72248    0.460   -0.10043    0.685   0.48173    0.9100   1.34076
0.0150   -2.17009    0.240   -0.70630    0.465   -0.08784    0.690   0.49585    0.9150   1.37220
0.0200   -2.05375    0.245   -0.69031    0.470   -0.07527    0.695   0.51007    0.9200   1.40507
0.0250   -1.95996    0.250   -0.67449    0.475   -0.06271    0.700   0.52440    0.9250   1.43953
0.0300   -1.88079    0.255   -0.65884    0.480   -0.05015    0.705   0.53884    0.9300   1.47579
0.0350   -1.81191    0.260   -0.64335    0.485   -0.03761    0.710   0.55338    0.9350   1.51410
0.0400   -1.75069    0.265   -0.62801    0.490   -0.02507    0.715   0.56805    0.9400   1.55477
0.0450   -1.69540    0.270   -0.61281    0.495   -0.01253    0.720   0.58284    0.9450   1.59819
0.0500   -1.64485    0.275   -0.59776    0.500    0.00000    0.725   0.59776    0.9500   1.64485
0.0550   -1.59819    0.280   -0.58284    0.505    0.01253    0.730   0.61281    0.9550   1.69540
0.0600   -1.55477    0.285   -0.56805    0.510    0.02507    0.735   0.62801    0.9600   1.75069
0.0650   -1.51410    0.290   -0.55338    0.515    0.03761    0.740   0.64335    0.9650   1.81191
0.0700   -1.47579    0.295   -0.53884    0.520    0.05015    0.745   0.65884    0.9700   1.88079
0.0750   -1.43953    0.300   -0.52440    0.525    0.06271    0.750   0.67449    0.9750   1.95996
0.0800   -1.40507    0.305   -0.51007    0.530    0.07527    0.755   0.69031    0.9800   2.05375
0.0850   -1.37220    0.310   -0.49585    0.535    0.08784    0.760   0.70630    0.9850   2.17009
0.0900   -1.34076    0.315   -0.48173    0.540    0.10043    0.765   0.72248    0.9900   2.32635
0.0950   -1.31058    0.320   -0.46770    0.545    0.11304    0.770   0.73885    0.9910   2.36562
0.1000   -1.28155    0.325   -0.45376    0.550    0.12566    0.775   0.75542    0.9920   2.40892
0.1050   -1.25357    0.330   -0.43991    0.555    0.13830    0.780   0.77219    0.9930   2.45726
0.1100   -1.22653    0.335   -0.42615    0.560    0.15097    0.785   0.78919    0.9940   2.51214
0.1150   -1.20036    0.340   -0.41246    0.565    0.16366    0.790   0.80642    0.9950   2.57583
0.1200   -1.17499    0.345   -0.39886    0.570    0.17637    0.795   0.82389    0.9960   2.65207
0.1250   -1.15035    0.350   -0.38532    0.575    0.18912    0.800   0.84162    0.9970   2.74778
0.1300   -1.12639    0.355   -0.37186    0.580    0.20189    0.805   0.85962    0.9980   2.87816
0.1350   -1.10306    0.360   -0.35846    0.585    0.21470    0.810   0.87790    0.9990   3.09023
0.1400   -1.08032    0.365   -0.34513    0.590    0.22754    0.815   0.89647    0.9995   3.29053
0.1450   -1.05812    0.370   -0.33185    0.595    0.24043    0.820   0.91537    0.9996   3.35279
0.1500   -1.03643    0.375   -0.31864    0.600    0.25335    0.825   0.93459    0.9997   3.43161
0.1550   -1.01522    0.380   -0.30548    0.605    0.26631    0.830   0.95417    0.9998   3.54008
0.1600   -0.99446    0.385   -0.29237    0.610    0.27932    0.835   0.97411    0.9999   3.71902


Computed Using IMSL* Library Functions. 


* IMSL (International Mathematical and Statistical Library) is a registered trade mark of IMSL, 
Inc. 




Table III. Critical Values of the Student’s t Distribution 


This table gives the critical values of the Student’s t distribution for degrees of
freedom $\nu$ = 1 (1) 30, 40, 60, 120, $\infty$. The $\alpha$-values are given corresponding
to upper-tail tests of significance. The critical values are given corresponding to
one-tail $\alpha$-levels equal to 0.40, 0.30, 0.20, 0.15, 0.10, 0.05, 0.025, 0.02, 0.015,
0.01, 0.0075, 0.005, 0.0025, and 0.0005. Since the distribution of t is symmetrical
about zero, the one-tailed significance level of $\alpha$ corresponds to the two-tailed
significance level of $2\alpha$. All the critical values are provided to three decimal
places.


[Figure: upper-tail critical value $t[\nu, 1 - \alpha]$]


Examples: (i) For $\nu$ = 15, $\alpha$ = 0.01, the desired critical value from the table
is t[15, 0.99] = 2.602.

(ii) For $\nu$ = 60, $\alpha$ = 0.05, the desired critical value from the table
is t[60, 0.95] = 1.671.
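
Critical values that fall between tabled entries can be approximated numerically. The sketch below is one way to do so using only the Python standard library: it integrates the t density with Simpson's rule and inverts the distribution function by bisection. The function names, step sizes, and tolerances are assumptions made here, and the routine expects $0 < \alpha < 0.5$; a statistical library would normally be used instead.

```python
from math import exp, lgamma, log, pi


def t_pdf(x, nu):
    """Density of Student's t with nu degrees of freedom."""
    log_c = lgamma((nu + 1) / 2) - lgamma(nu / 2) - 0.5 * log(nu * pi)
    return exp(log_c - (nu + 1) / 2 * log(1 + x * x / nu))


def t_cdf(x, nu, steps_per_unit=200):
    """P(T <= x) for x >= 0, using composite Simpson's rule on [0, x]."""
    if x == 0:
        return 0.5
    m = 2 * max(1, int(x * steps_per_unit / 2) + 1)   # even number of subintervals
    h = x / m
    total = t_pdf(0.0, nu) + t_pdf(x, nu)
    for k in range(1, m):
        total += (4 if k % 2 else 2) * t_pdf(k * h, nu)
    return 0.5 + total * h / 3


def t_critical(nu, alpha, tol=1e-7):
    """Upper-tail critical value t[nu, 1 - alpha], for 0 < alpha < 0.5."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, nu) < 1 - alpha:   # expand the bracket until it contains the root
        hi *= 2
    while hi - lo > tol:               # bisection on the distribution function
        mid = (lo + hi) / 2
        if t_cdf(mid, nu) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(round(t_critical(15, 0.01), 3), round(t_critical(60, 0.05), 3))
```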


                                                          α
   ν     0.40    0.30    0.20    0.15    0.10    0.05   0.025    0.02   0.015    0.01  0.0075   0.005  0.0025  0.0005

   1    0.325   0.727   1.376   1.963   3.078   6.314  12.706  15.895  21.205  31.821  42.434  63.657 127.322 636.590
   2    0.289   0.617   1.061   1.386   1.886   2.920   4.303   4.849   5.643   6.965   8.073   9.925  14.089  31.598
   3    0.277   0.584   0.978   1.250   1.638   2.353   3.182   3.482   3.896   4.541   5.047   5.841   7.453  12.924
   4    0.271   0.569   0.941   1.190   1.533   2.132   2.776   2.999   3.298   3.747   4.088   4.604   5.598   8.610
   5    0.267   0.559   0.920   1.156   1.476   2.015   2.571   2.757   3.003   3.365   3.634   4.032   4.773   6.869

   6    0.265   0.553   0.906   1.134   1.440   1.943   2.447   2.612   2.829   3.143   3.372   3.707   4.317   5.959
   7    0.263   0.549   0.896   1.119   1.415   1.895   2.365   2.517   2.715   2.998   3.203   3.499   4.029   5.408
   8    0.262   0.546   0.889   1.108   1.397   1.860   2.306   2.449   2.634   2.896   3.085   3.355   3.833   5.041
   9    0.261   0.543   0.883   1.100   1.383   1.833   2.262   2.398   2.574   2.821   2.998   3.250   3.690   4.781
  10    0.260   0.542   0.879   1.093   1.372   1.812   2.228   2.359   2.527   2.764   2.932   3.169   3.581   4.587

  11    0.260   0.540   0.876   1.088   1.363   1.796   2.201   2.328   2.491   2.718   2.879   3.106   3.497   4.437
  12    0.259   0.539   0.873   1.083   1.356   1.782   2.179   2.303   2.461   2.681   2.836   3.055   3.428   4.318
  13    0.259   0.537   0.870   1.079   1.350   1.771   2.160   2.282   2.436   2.650   2.801   3.012   3.372   4.221
  14    0.258   0.537   0.868   1.076   1.345   1.761   2.145   2.264   2.415   2.624   2.771   2.977   3.326   4.140
  15    0.258   0.536   0.866   1.074   1.341   1.753   2.131   2.249   2.397   2.602   2.746   2.947   3.286   4.073

  16    0.258   0.535   0.865   1.071   1.337   1.746   2.120   2.235   2.382   2.583   2.724   2.921   3.252   4.015
  17    0.257   0.534   0.863   1.069   1.333   1.740   2.110   2.224   2.368   2.567   2.706   2.898   3.222   3.965
  18    0.257   0.534   0.862   1.067   1.330   1.734   2.101   2.214   2.356   2.552   2.689   2.878   3.197   3.922
  19    0.257   0.533   0.861   1.066   1.328   1.729   2.093   2.205   2.346   2.539   2.674   2.861   3.174   3.883
  20    0.257   0.533   0.860   1.064   1.325   1.725   2.086   2.197   2.336   2.528   2.661   2.845   3.153   3.849

  21    0.257   0.532   0.859   1.063   1.323   1.721   2.080   2.189   2.328   2.518   2.649   2.831   3.135   3.819
  22    0.256   0.532   0.858   1.061   1.321   1.717   2.074   2.183   2.320   2.508   2.639   2.819   3.119   3.792
  23    0.256   0.532   0.858   1.060   1.319   1.714   2.069   2.177   2.313   2.500   2.629   2.807   3.104   3.768
  24    0.256   0.531   0.857   1.059   1.318   1.711   2.064   2.172   2.307   2.492   2.620   2.797   3.091   3.745
  25    0.256   0.531   0.856   1.058   1.316   1.708   2.060   2.167   2.301   2.485   2.612   2.787   3.078   3.725

  26    0.256   0.531   0.856   1.058   1.315   1.706   2.056   2.162   2.296   2.479   2.605   2.779   3.067   3.707
  27    0.256   0.531   0.855   1.057   1.314   1.703   2.052   2.158   2.291   2.473   2.598   2.771   3.057   3.690
  28    0.256   0.530   0.855   1.056   1.313   1.701   2.048   2.154   2.286   2.467   2.592   2.763   3.047   3.674
  29    0.256   0.530   0.854   1.055   1.311   1.699   2.045   2.150   2.282   2.462   2.586   2.756   3.038   3.659
  30    0.256   0.530   0.854   1.055   1.310   1.697   2.042   2.147   2.278   2.457   2.581   2.750   3.030   3.646

  40    0.255   0.529   0.851   1.050   1.303   1.684   2.021   2.123   2.250   2.423   2.542   2.704   2.971   3.551
  60    0.254   0.527   0.848   1.045   1.296   1.671   2.000   2.099   2.223   2.390   2.504   2.660   2.915   3.460
 120    0.254   0.526   0.845   1.041   1.289   1.658   1.980   2.076   2.196   2.358   2.468   2.617   2.860   3.373
   ∞    0.253   0.524   0.842   1.036   1.282   1.645   1.960   2.054   2.170   2.326   2.432   2.576   2.807   3.291


From J. Neter, M. H. Kutner, C. J. Nachtsheim, and W. Wasserman, Applied
Linear Statistical Models, Fourth Edition, © 1996 by Richard D. Irwin, Inc.,
Chicago. Reprinted by permission (from Table B.2).




Table IV. Critical Values of the Chi-Square Distribution 


This table gives the critical values of the chi-square ($\chi^2$) distribution for degrees
of freedom $\nu$ = 1 (1) 30 (10) 100. The critical values are given corresponding
to $\alpha$-levels equal to 0.995, 0.990, 0.975, 0.95, 0.90, 0.75, 0.50, 0.25, 0.10,
0.05, 0.025, 0.01, and 0.005. All the critical values are provided to two decimal
places.


[Figure: upper-tail critical value $\chi^2[\nu, 1 - \alpha]$]


Examples: (i) For $\nu$ = 15, $\alpha$ = 0.05, the desired critical value from
the table is $\chi^2$[15, 0.95] = 25.00.
(ii) For $\nu$ = 20, $\alpha$ = 0.90, the desired critical value from
the table is $\chi^2$[20, 0.1] = 12.44.
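
The two examples can be checked numerically. The sketch below, which uses only the Python standard library, evaluates the chi-square distribution function through the series for the regularized lower incomplete gamma function and inverts it by bisection. The function names and tolerances are assumptions introduced here; the series is adequate for the range of this table but is not meant for very large arguments.

```python
from math import exp, lgamma, log


def chi2_cdf(x, nu):
    """P(X <= x) for chi-square with nu d.f., via the series for P(nu/2, x/2)."""
    if x <= 0:
        return 0.0
    a, y = nu / 2, x / 2
    term = exp(a * log(y) - y - lgamma(a + 1))   # k = 0 term of the series
    total, k = term, 0
    while term > 1e-16 * total:
        k += 1
        term *= y / (a + k)                      # ratio of successive terms
        total += term
    return total


def chi2_critical(nu, alpha, tol=1e-8):
    """Value c with upper-tail area P(X > c) = alpha, i.e. chi2[nu, 1 - alpha]."""
    target = 1 - alpha
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, nu) < target:   # expand the bracket until it contains the root
        hi *= 2
    while hi - lo > tol:               # bisection on the distribution function
        mid = (lo + hi) / 2
        if chi2_cdf(mid, nu) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(round(chi2_critical(15, 0.05), 2), round(chi2_critical(20, 0.90), 2))
```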


                                                          α
   ν       0.995     0.990     0.975     0.950   0.900   0.750   0.500   0.250   0.100   0.050   0.025   0.010   0.005

   1   0.0000393  0.000157  0.000982   0.00393    0.02    0.10    0.45    1.32    2.71    3.84    5.02    6.63    7.88
   2        0.01      0.02      0.05      0.10    0.21    0.58    1.39    2.77    4.61    5.99    7.38    9.21   10.60
   3        0.07      0.11      0.22      0.35    0.58    1.21    2.37    4.11    6.25    7.81    9.35   11.34   12.84
   4        0.21      0.30      0.48      0.71    1.06    1.92    3.36    5.39    7.78    9.49   11.14   13.28   14.86
   5        0.41      0.55      0.83      1.15    1.61    2.67    4.35    6.63    9.24   11.07   12.83   15.09   16.75

   6        0.68      0.87      1.24      1.64    2.20    3.45    5.35    7.84   10.64   12.59   14.45   16.81   18.55
   7        0.99      1.24      1.69      2.17    2.83    4.25    6.35    9.04   12.02   14.07   16.01   18.48   20.28
   8        1.34      1.65      2.18      2.73    3.49    5.07    7.34   10.22   13.36   15.51   17.53   20.09   21.96
   9        1.73      2.09      2.70      3.33    4.17    5.90    8.34   11.39   14.68   16.92   19.02   21.67   23.59
  10        2.16      2.56      3.25      3.94    4.87    6.74    9.34   12.55   15.99   18.31   20.48   23.21   25.19

  11        2.60      3.05      3.82      4.57    5.58    7.58   10.34   13.70   17.28   19.68   21.92   24.72   26.76
  12        3.07      3.57      4.40      5.23    6.30    8.44   11.34   14.85   18.55   21.03   23.34   26.22   28.30
  13        3.57      4.11      5.01      5.89    7.04    9.30   12.34   15.98   19.81   22.36   24.74   27.69   29.82
  14        4.07      4.66      5.63      6.57    7.79   10.17   13.34   17.12   21.06   23.68   26.12   29.14   31.32
  15        4.60      5.23      6.27      7.26    8.55   11.04   14.34   18.25   22.31   25.00   27.49   30.58   32.80

  16        5.14      5.81      6.91      7.96    9.31   11.91   15.34   19.37   23.54   26.30   28.85   32.00   34.27
  17        5.70      6.41      7.56      8.67   10.09   12.79   16.34   20.49   24.77   27.59   30.19   33.41   35.72
  18        6.26      7.01      8.23      9.39   10.86   13.68   17.34   21.60   25.99   28.87   31.53   34.81   37.16
  19        6.84      7.63      8.91     10.12   11.65   14.56   18.34   22.72   27.20   30.14   32.85   36.19   38.58
  20        7.43      8.26      9.59     10.85   12.44   15.45   19.34   23.83   28.41   31.41   34.17   37.57   40.00

  21        8.03      8.90     10.28     11.59   13.24   16.34   20.34   24.93   29.62   32.67   35.48   38.93   41.40
  22        8.64      9.54     10.98     12.34   14.04   17.24   21.34   26.04   30.81   33.92   36.78   40.29   42.80
  23        9.26     10.20     11.69     13.09   14.85   18.14   22.34   27.14   32.01   35.17   38.08   41.64   44.18
  24        9.89     10.86     12.40     13.85   15.66   19.04   23.34   28.24   33.20   36.42   39.36   42.98   45.56
  25       10.52     11.52     13.12     14.61   16.47   19.94   24.34   29.34   34.38   37.65   40.65   44.31   46.93

  26       11.16     12.20     13.84     15.38   17.29   20.84   25.34   30.43   35.56   38.89   41.92   45.64   48.29
  27       11.81     12.88     14.57     16.15   18.11   21.75   26.34   31.53   36.74   40.11   43.19   46.96   49.64
  28       12.46     13.56     15.31     16.93   18.94   22.66   27.34   32.62   37.92   41.34   44.46   48.28   50.99
  29       13.12     14.26     16.05     17.71   19.77   23.57   28.34   33.71   39.09   42.56   45.72   49.59   52.34
  30       13.79     14.95     16.79     18.49   20.60   24.48   29.34   34.80   40.26   43.77   46.98   50.89   53.67

  40       20.71     22.16     24.43     26.51   29.05   33.66   39.34   45.62   51.80   55.76   59.34   63.69   66.77
  50       27.99     29.71     32.36     34.76   37.69   42.94   49.33   56.33   63.17   67.50   71.42   76.15   79.49
  60       35.53     37.48     40.48     43.19   46.46   52.29   59.33   66.98   74.40   79.08   83.30   88.38   91.95
  70       43.28     45.44     48.76     51.74   55.33   61.70   69.33   77.58   85.53   90.53   95.02  100.42  104.22
  80       51.17     53.54     57.15     60.39   64.28   71.14   79.33   88.13   96.58  101.88  106.63  112.33  116.32
  90       59.20     61.75     65.65     69.13   73.29   80.62   89.33   98.64  107.56  113.14  118.14  124.12  128.30
 100       67.33     70.06     74.22     77.93   82.36   90.13   99.33  109.14  118.50  124.34  129.56  135.81  140.17



From C. M. Thompson, “Table of Percentage Points of the Chi-Square Distribution.” 
Biometrika, 32, (1941), 188-189. Reprinted by permission. 




Table V. Critical Values of the F Distribution 


This table gives the critical values of the F distribution for degrees of freedom
ν₁ = 1(1)10, 12, 15, 20, 24, 30, 40, 60, 120, ∞ arranged across the top of the
table and ν₂ = 1(1)30, 40, 60, 120, ∞ arranged along the left margin of the
table. The α-values are given for upper-tail tests of significance. All the critical
values are provided to two decimal places. The lower-tailed critical values are
not given, but can be obtained using the following relation: F[ν₁, ν₂; 1 − α] =
1/F[ν₂, ν₁; α].


[Figure: F density with area 1 − α to the left of the critical value F[ν₁, ν₂; 1 − α].]


Examples: (i) For ν₁ = 6, ν₂ = 30, α = 0.05, the desired critical value
from the table is F[6, 30; 0.95] = 2.42.
(ii) For ν₁ = 10, ν₂ = 60, α = 0.10, the desired critical value
from the table is F[10, 60; 0.90] = 1.71.
(iii) For ν₁ = 8, ν₂ = 24, α = 0.95, the desired critical value is
obtained as F[8, 24; 0.05] = 1/F[24, 8; 0.95] = 1/3.12 =
0.32.
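The reciprocal relation used in example (iii) follows from the fact that the reciprocal of an F variable is again an F variable with the degrees of freedom interchanged. A short derivation is sketched below, under the assumption that F[ν₁, ν₂; p] denotes the 100p-th percentile of the F distribution with ν₁ and ν₂ degrees of freedom.

```latex
% Assumption: F[\nu_1,\nu_2;p] is the 100p-th percentile of F(\nu_1,\nu_2).
\begin{align*}
X \sim F(\nu_1,\nu_2) \;&\Longrightarrow\; 1/X \sim F(\nu_2,\nu_1),\\
P\{X \le F[\nu_1,\nu_2;1-\alpha]\} = 1-\alpha
  \;&\Longleftrightarrow\; P\{1/X \ge 1/F[\nu_1,\nu_2;1-\alpha]\} = 1-\alpha\\
  \;&\Longleftrightarrow\; P\{1/X \le 1/F[\nu_1,\nu_2;1-\alpha]\} = \alpha,\\
\text{hence}\quad F[\nu_2,\nu_1;\alpha] &= \frac{1}{F[\nu_1,\nu_2;1-\alpha]}.
\end{align*}
```

With ν₁ = 24, ν₂ = 8, and 1 − α = 0.95 this gives F[8, 24; 0.05] = 1/F[24, 8; 0.95], which is the form used in example (iii).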


[Table V body: critical values F[ν₁, ν₂; 1 − α] for α = 0.100, 0.050, 0.025, 0.010, and 0.005, with ν₁ = 1(1)10, 12, 15, 20, 24, 30, 40, 60, 120, ∞ across the top and ν₂ = 1(1)30, 40, 60, 120, ∞ along the left margin.]


From M. Merrington and C. M. Thompson, “Tables of Percentage Points of the Inverted Beta (F) Distribution,”
Biometrika, 33 (1943), 73-88. Reprinted by permission.




Table VI. Power of the Student’s t Test 


This table gives the values of the noncentrality parameter δ of the noncentral
t distribution with degrees of freedom ν = 1(1)30, 40, 60, 100, ∞; one-tailed
level of significance α = 0.05, 0.025, 0.01; and the power = 1 − β =
0.10(0.10)0.90, 0.95, 0.99. Since the distribution of t is symmetrical about
zero, the one-tailed levels of significance also represent two-tailed values of
α = 0.10, 0.05, and 0.02. The table can be used to determine the power of a
test of significance based on the Student’s t distribution. For example, the power
of the t test corresponding to ν = 30, δ = 3.0, and α = 0.05 is approximately
equal to 0.90.
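The example can be checked by computing the power directly from the noncentral t distribution. The following is a minimal sketch, not the method used to construct the table: it assumes the representation T′ = (Z + δ)/(S/√ν), with Z standard normal and S an independent chi variable with ν degrees of freedom, and integrates numerically by Simpson's rule. The function names and the critical value `t_crit` (here the standard one-tailed 5% point of Student's t, supplied by the caller) are assumptions of the sketch.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def t_test_power(nu: int, delta: float, t_crit: float, steps: int = 4000) -> float:
    """One-tailed power P(T' > t_crit) for a noncentral t variable T' with
    nu degrees of freedom and noncentrality delta.

    Uses P(T' <= t) = E[ Phi(t * S / sqrt(nu) - delta) ], S ~ chi(nu),
    evaluated by Simpson's rule (steps must be even)."""
    log_norm = (nu / 2.0 - 1.0) * math.log(2.0) + math.lgamma(nu / 2.0)

    def chi_pdf(s: float) -> float:
        # Density of the chi distribution with nu degrees of freedom.
        if s == 0.0:
            return math.sqrt(2.0 / math.pi) if nu == 1 else 0.0
        return math.exp((nu - 1) * math.log(s) - 0.5 * s * s - log_norm)

    def integrand(s: float) -> float:
        return normal_cdf(t_crit * s / math.sqrt(nu) - delta) * chi_pdf(s)

    upper = math.sqrt(nu) + 10.0  # chi(nu) has negligible mass beyond this point
    h = upper / steps
    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return 1.0 - total * h / 3.0


# nu = 30, delta = 3.0, alpha = 0.05 (one-tailed critical value 1.697):
# the result should be close to the tabulated power of 0.90.
print(round(t_test_power(30, 3.0, 1.697), 3))
```

The same function can be used in the other direction, by searching over δ for a target power, which is how a table of this kind is normally read.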


α = 0.05
                                 Power = 1 − β
   ν   0.99   0.95   0.90   0.80   0.70   0.60   0.50   0.40   0.30   0.20   0.10

   1  16.47  12.53  10.51   8.19   6.63   5.38   4.31   3.35   2.46   1.60    .64
   2   6.88   5.52   4.81   3.98   3.40   2.92   2.49   2.07   1.63   1.15    .50
   3   5.47   4.46   3.93   3.30   2.85   2.48   2.13   1.79   1.43   1.02    .46
   4   4.95   4.07   3.60   3.04   2.64   2.30   1.99   1.67   1.34    .96    .43
   5   4.70   3.87   3.43   2.90   2.53   2.21   1.91   1.61   1.29    .9?    .42

   6   4.55   3.75   3.33   2.82   2.46   2.15   1.86   1.57   1.26    .90    .41
   7   4.45   3.67   3.26   2.7?   2.41   2.11   1.82   1.54   1.24    .89    .40
   8   4.38   3.62   3.21   2.73   2.38   2.08   1.80   1.52   1.22    .88    .40
   9   4.32   3.58   3.18   2.70   2.35   2.06   1.78   1.51   1.21    .87    .39
  10   4.28   3.54   3.15   2.67   2.33   2.04   1.77   1.49   1.20    .86    .39

  11   4.25   3.52   3.13   2.65   2.31   2.02   1.75   1.48   1.19    .86    .39
  12   4.22   3.50   3.11   2.64   2.30   2.01   1.74   1.47   1.19    .85    .38
  13   4.20   3.48   3.09   2.63   2.29   2.00   1.74   1.47   1.18    .85    .38
  14   4.18   3.46   3.08   2.62   2.28   2.00   1.73   1.46   1.18    .84    .38
  15   4.17   3.45   3.07   2.61   2.27   1.99   1.72   1.46   1.17    .84    .38

  16   4.16   3.44   3.06   2.60   2.27   1.98   1.72   1.45   1.17    .84    .38
  17   4.14   3.43   3.05   2.59   2.26   1.98   1.71   1.45   1.17    .84    .38
  18   4.13   3.42   3.04   2.59   2.26   1.97   1.71   1.45   1.16    .83    .38
  19   4.12   3.41   3.04   2.58   2.25   1.97   1.71   1.44   1.16    .83    .38
  20   4.12   3.41   3.03   2.58   2.2?   1.97   1.70   1.44   1.16    .83    .38

  21   4.11   3.40   3.03   2.57   2.24   1.96   1.70   1.44   1.16    .83    .38
  22   4.10   3.40   3.02   2.57   2.24   1.96   1.70   1.44   1.16    .83    .37
  23   4.10   3.39   3.02   2.56   2.24   1.96   1.69   1.43   1.15    .83    .37
  24   4.09   3.39   3.01   2.56   2.23   1.95   1.69   1.43   1.15    .83    .37
  25   4.09   3.38   3.01   2.56   2.23   1.95   1.69   1.43   1.15    .83    .37

  26   4.08   3.38   3.01   2.55   2.23   1.95   1.69   1.43   1.15    .82    .37
  27   4.08   3.38   3.00   2.55   2.23   1.95   1.69   1.43   1.15    .82    .37
  28   4.07   3.37   3.00   2.55   2.2?   1.95   1.69   1.43   1.15    .82    .37
  29   4.07   3.37   3.00   2.55   2.22   1.94   1.68   1.42   1.15    .82    .37
  30   4.07   3.37   3.00   2.54   2.22   1.94   1.68   1.42   1.15    .82    .37

  40   4.04   3.35   2.98   2.53   2.21   1.93   1.67   1.42   1.14    .82    .37
  60   4.02   3.33   2.96   2.52   2.19   1.92   1.66   1.41   1.13    .81    .37
 100   4.00   3.31   2.95   2.50   2.18   1.91   1.66   1.40   1.13    .81    .37
   ∞   3.97   3.29   2.93   2.49   2.17   1.90   1.64   1.39   1.12    .80    .36


TABLE VI (continued)


α = 0.025
                          Power = 1 − β
   ν   0.99   0.95   0.90   0.80   0.70   0.60   0.50   0.40   0.30

   1  32.83  24.98  20.96  16.33  13.21  10.73   8.60   6.68   4.91
   2   9.67   7.7?   6.80   5.65   4.86   4.21   3.63   3.07   2.50
   3   6.88   5.65   5.01   4.26   3.72   3.28   2.87   2.47   2.05
   4   5.94   4.93   4.40   3.76   3.31   2.93   2.58   2.23   1.86
   5   5.49   4.57   4.09   3.51   3.10   2.75   2.43   2.11   1.76

   6   5.22   4.37   3.91   3.37   2.98   2.64   2.34   2.03   1.70
   7   5.06   4.23   3.80   3.27   2.89   2.57   2.28   1.98   1.66
   8   4.94   4.14   3.71   3.20   2.83   2.52   2.23   1.94   1.63
   9   4.85   4.07   3.65   3.15   2.79   2.48   2.20   1.91   1.60
  10   4.78   4.01   3.60   3.11   2.75   2.45   2.17   1.89   1.59

  11   4.73   3.97   3.57   3.08   2.73   2.43   2.15   1.87   1.57
  12   4.69   3.93   3.54   3.05   2.70   2.41   2.13   1.85   1.56
  13   4.65   3.91   3.51   3.03   2.69   2.39   2.12   1.84   1.55
  14   4.62   3.88   3.49   3.01   2.67   2.38   2.11   1.83   1.54
  15   4.60   3.86   3.47   3.00   2.66   2.37   2.09   1.82   1.53

  16   4.58   3.84   3.46   2.98   2.65   2.36   2.09   1.81   1.53
  17   4.56   3.83   3.44   2.97   2.64   2.35   2.08   1.81   1.52
  18   4.54   3.82   3.43   2.96   2.63   2.34   2.07   1.80   1.52
  19   4.52   3.80   3.42   2.95   2.61   2.33   2.06   1.80   1.51
  20   4.51   3.79   3.41   2.95   2.61   2.33   2.06   1.79   1.51

  21   4.50   3.78   3.40   2.93   2.60   2.32   2.05   1.79   1.50
  22   4.49   3.77   3.39   2.93   2.60   2.32   2.05   1.78   1.50
  23   4.48   3.77   3.39   2.93   2.59   2.31   2.05   1.78   1.50
  24   4.47   3.76   3.38   2.92   2.59   2.31   2.04   1.78   1.50
  25   4.46   3.75   3.37   2.92   2.58   2.30   2.04   1.77   1.49

  26   4.46   3.75   3.37   2.92   2.58   2.30   2.04   1.77   1.49
  27   4.45   3.74   3.36   2.91   2.58   2.30   2.03   1.77   1.49
  28   4.44   3.73   3.36   2.90   2.57   2.29   2.03   1.77   1.49
  29   4.44   3.73   3.35   2.90   2.57   2.29   2.03   1.77   1.48
  30   4.43   3.73   3.35   2.90   2.57   2.29   2.02   1.76   1.48

  40   4.39   3.69   3.32   2.87   2.55   2.27   2.01   1.75   1.47
  60   4.36   3.66   3.29   2.85   2.53   2.25   1.99   1.73   1.46
 100   4.33   3.64   3.27   2.83   2.51   2.23   1.98   1.73   1.45
   ∞   4.29   3.60   3.24   2.80   2.48   2.21   1.96   1.71   1.44


TABLE VI (continued)


α = 0.01
                                 Power = 1 − β
   ν   0.99   0.95   0.90   0.80   0.70   0.60   0.50   0.40   0.30   0.20   0.10

   1  82.00  62.40  52.37  40.80  33.00  26.79  21.47  16.69  12.27   8.07   4.00
   2  15.22  12.26  10.74   8.96   7.73   6.73   5.83   4.98   4.12   3.20   2.08
   3   9.34   7.71   6.86   5.87   5.17   4.59   4.07   3.56   3.03   2.44   1.66
   4   7.52   6.28   5.64   4.88   4.34   3.88   3.47   3.06   2.63   2.14   1.48
   5   6.68   5.62   5.07   4.40   3.93   3.54   3.17   2.81   2.42   1.98   1.38
   6   6.21   5.25   4.74   4.13   3.70   3.33   2.99   2.66   2.30   1.88   1.32
   7   5.91   5.01   4.53   3.96   3.55   3.20   2.88   2.56   2.22   1.82   1.27
   8   5.71   4.85   4.39   3.84   3.44   3.11   2.80   2.49   2.16   1.77   1.24
   9   5.56   4.72   4.28   3.75   3.37   3.04   2.74   2.43   2.11   1.74   1.22
  10   5.45   4.63   4.20   3.68   3.31   2.99   2.69   2.39   2.08   1.71   1.20
  11   5.36   4.56   4.14   3.63   3.26   2.94   2.65   2.36   2.05   1.69   1.18
  12   5.29   4.50   4.09   3.58   3.22   2.91   2.62   2.33   2.03   1.67   1.17
  13   5.23   4.46   4.04   3.55   3.19   2.88   2.60   2.31   2.01   1.65   1.16
  14   5.18   4.42   4.01   3.51   3.16   2.86   2.57   2.29   1.99   1.64   1.15
  15   5.14   4.38   3.98   3.49   3.14   2.84   2.56   2.28   1.98   1.63   1.14
  16   5.11   4.35   3.95   3.47   3.12   2.82   2.54   2.26   1.97   1.62   1.14
  17   5.08   4.33   3.93   3.45   3.10   2.80   2.53   2.25   1.96   1.61   1.13
  18   5.05   4.31   3.91   3.43   3.09   2.79   2.52   2.24   1.95   1.60   1.13
  19   5.03   4.29   3.89   3.42   3.07   2.78   2.50   2.23   1.94   1.60   1.12
  20   5.01   4.27   3.88   3.40   3.06   2.77   2.50   2.22   1.93   1.59   1.12
  21   4.99   4.25   3.86   3.39   3.05   2.76   2.49   2.22   1.92   1.59   1.11
  22   4.97   4.24   3.85   3.38   3.04   2.75   2.48   2.21   1.92   1.58   1.11
  23   4.96   4.23   3.84   3.37   3.03   2.74   2.47   2.20   1.91   1.58   1.11
  24   4.94   4.22   3.83   3.36   3.02   2.73   2.47   2.20   1.91   1.57   1.11
  25   4.93   4.20   3.82   3.35   3.02   2.73   2.46   2.19   1.90   1.57   1.10
  26   4.92   4.19   3.81   3.34   3.01   2.72   2.45   2.19   1.90   1.57   1.10
  27   4.91   4.19   3.80   3.34   3.00   2.72   2.45   2.18   1.90   1.56   1.10
  28   4.90   4.18   3.79   3.33   3.00   2.71   2.44   2.18   1.89   1.56   1.10
  29   4.89   4.17   3.79   3.32   2.99   2.71   2.44   2.17   1.89   1.56   1.10
  30   4.88   4.16   3.78   3.32   2.99   2.70   2.44   2.17   1.89   1.55   1.09
  40   4.82   4.11   3.74   3.28   2.95   2.67   2.41   2.15   1.86   1.54   1.08
  60   4.76   4.06   3.69   3.24   2.92   2.64   2.38   2.12   1.84   1.52   1.07
 100   4.72   4.03   3.66   3.21   2.89   2.62   2.36   2.10   1.83   1.51   1.06
   ∞   4.65   3.97   3.61   3.17   2.85   2.58   2.33   2.07   1.80   1.48   1.04


From D. B. Owen, “The Power of Student’s t Test,” Journal of the American
Statistical Association, 60 (1965), 320-333. Reprinted by permission.




Table VII. Power of the Analysis of Variance F Test 


This table gives the values of type II error (β) of a test of significance based on
the F distribution corresponding to the numerator degrees of freedom ν₁ =
1(1)10(2)12; denominator degrees of freedom ν₂ = 2(2)30, 40, 60, 120, ∞;
standardized noncentrality parameter φ = 0.5(0.5)1.0(0.2)2.2(0.4)3.0;
and the level of significance α = 0.01, 0.05, 0.1. For example, the power of
the F test corresponding to ν₁ = 3, ν₂ = 30, φ = 1.4, and α = 0.05 is equal to
1 − 0.4182 = 0.5818. To obtain power for odd values of ν₂, a linear interpolation
in the reciprocal of ν₂ may be used, which generally gives three-decimal-place
accuracy. To obtain power for values of φ not given in the table (0.5 < φ < 3.0),
a three-point Lagrangian interpolation may be used, which generally gives an
accuracy of at least two decimal places. For φ > 3, the values of power are
mostly close to one.
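The two interpolation rules described above can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the inputs are entries of the α = 0.01, ν₁ = 1 panel of this table (ν₂ = 10 and 12 at φ = 2.0, and ν₂ = 10 at φ = 1.4, 1.6, 1.8).

```python
def interp_reciprocal(nu2, nu_lo, beta_lo, nu_hi, beta_hi):
    """Linear interpolation of beta in 1/nu2 between two tabulated
    denominator degrees of freedom nu_lo and nu_hi."""
    w = (1.0 / nu2 - 1.0 / nu_lo) / (1.0 / nu_hi - 1.0 / nu_lo)
    return beta_lo + w * (beta_hi - beta_lo)


def lagrange3(x, xs, ys):
    """Three-point Lagrangian interpolation through (xs[i], ys[i]), i = 0, 1, 2."""
    total = 0.0
    for i in range(3):
        weight = 1.0
        for j in range(3):
            if j != i:
                weight *= (x - xs[j]) / (xs[i] - xs[j])
        total += weight * ys[i]
    return total


# Type II error for nu1 = 1, nu2 = 11, phi = 2.0, alpha = 0.01,
# from the tabulated entries at nu2 = 10 (.5824) and nu2 = 12 (.5532).
beta_11 = interp_reciprocal(11, 10, 0.5824, 12, 0.5532)
print(round(beta_11, 4), round(1.0 - beta_11, 4))  # beta and power

# Type II error for nu1 = 1, nu2 = 10, phi = 1.5, alpha = 0.01,
# from the tabulated entries at phi = 1.4, 1.6, 1.8.
beta_15 = lagrange3(1.5, (1.4, 1.6, 1.8), (0.8184, 0.7501, 0.6704))
print(round(beta_15, 4), round(1.0 - beta_15, 4))  # beta and power
```

In both cases the interpolation is carried out on β and the power is then obtained as 1 − β.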


α = 0.01
                                        φ
  ν₂    0.5    1.0    1.2    1.4    1.6    1.8    2.0    2.2    2.6    3.0

ν₁ = 1
   2  .9851  .9705  .9620  .9521  .9408  .9282  .9143  .8991  .8654  .8277
   4  .9809  .9492  .9280  .9012  .8682  .8292  .7843  .7341  .6216  .5014
   6  .9782  .9340  .9030  .8629  .8131  .7541  .6870  .6136  .4589  .3125
   8  .9764  .9236  .8859  .8367  .7759  .7043  .6242  .5387  .3678  .2211
  10  .9752  .9163  .8738  .8184  .7501  .6704  .5824  .4904  .3136  .1725
  12  .9743  .9109  .8650  .8050  .7314  .6462  .5532  .4574  .2787  .1437
  14  .9736  .9068  .8582  .7949  .7174  .6283  .5318  .4336  .2547  .1250
  16  .9730  .9036  .8529  .7870  .7066  .6145  .5156  .4158  .2374  .1121
  18  .9726  .9010  .8487  .7807  .6979  .6036  .5028  .4020  .2243  .1027
  20  .9723  .8989  .8452  .7755  .6908  .5947  .4925  .3910  .2141  .0957
  22  .9720  .8971  .8423  .7712  .6850  .5874  .4841  .3820  .2060  .0902
  24  .9717  .8956  .8398  .7675  .6801  .5813  .4771  .3746  .1994  .0858
  26  .9715  .8943  .8377  .7644  .6758  .5760  .4712  .3683  .1938  .0822
  28  .9713  .8931  .8359  .7617  .6722  .5716  .4661  .3630  .1892  .0792
  30  .9711  .8922  .8343  .7593  .6690  .5677  .4617  .3584  .1852  .0767
  40  .9705  .8886  .8285  .7509  .6578  .5539  .4462  .3424  .1718  .0683
  60  .9699  .8850  .8226  .7423  .6463  .5401  .4308  .3267  .1590  .0608
 120  .9693  .8812  .8165  .7335  .6347  .5261  .4155  .3113  .1468  .0539
   ∞  .9687  .8773  .8102  .7244  .6229  .5120  .4003  .2962  .1354  .0478

ν₁ = 2
   2  .9863  .9753  .9688  .9613  .9527  .9430  .9323  .9207  .8945  .8650
   4  .9828  .9567  .9386  .9153  .8862  .8511  .8100  .7635  .6571  .5401
   6  .9803  .9409  .9118  .8730  .8237  .7640  .6951  .6191  .4576  .3052
   8  .9784  .9288  .8910  .8401  .7754  .6982  .6110  .5182  .3358  .1869
  10  .9770  .9196  .8751  .8150  .7393  .6500  .5515  .4498  .2626  .1268
  12  .9760  .9124  .8627  .7957  .7118  .6142  .5085  .4022  .2163  .0934
  14  .9752  .9067  .8529  .7806  .6905  .5869  .4765  .3678  .1854  .0733
  16  .9745  .9021  .8450  .7684  .6736  .5655  .4519  .3420  .1636  .0603
  18  .9740  .8983  .8386  .7585  .6600  .5485  .4326  .3221  .1476  .0513
  20  .9735  .8951  .8331  .7502  .6486  .5345  .4170  .3063  .1354  .0449
  22  .9731  .8924  .8285  .7433  .6392  .5229  .4042  .2936  .1260  .0401
  24  .9728  .8901  .8246  .7373  .6312  .5132  .3936  .2830  .1184  .0364
  26  .9725  .8881  .8212  .7322  .6243  .5048  .3845  .2742  .1122  .0335
  28  .9723  .8863  .8182  .7277  .6182  .4976  .3768  .2667  .1070  .0312
  30  .9721  .8848  .8156  .7238  .6130  .4914  .3701  .2603  .1027  .0293
  40  .9713  .8791  .8060  .7096  .5943  .4693  .3468  .2382  .0885  .0233
  60  .9704  .8731  .7960  .6948  .5749  .4469  .3237  .2170  .0757  .0183
 120  .9695  .8668  .7854  .6794  .5551  .4244  .3011  .1968  .0643  .0143
   ∞  .9686  .8600  .7743  .6634  .5349  .4019  .2789  .1776  .0543  .0111

ν₁ = 3
   2  .9867  .9769  .9711  .9644  .9567  .9481  .9385  .9280  .9045  .8779
   4  .9835  .9592  .9421  .9199  .8919  .8580  .8181  .7726  .6678  .5517
   6  .9809  .9427  .9136  .8742  .8237  .7620  .6906  .6117  .4448  .2899
   8  .9790  .9291  .8896  .8357  .7665  .6835  .5902  .4917  .3032  .1576
  10  .9775  .9181  .8703  .8047  .7214  .6234  .5166  .4085  .2191  .0941
  12  .9763  .9093  .8547  .7800  .6861  .5776  .4625  .3504  .1675  .0615
  14  .9753  .9021  .8419  .7600  .6580  .5421  .4220  .3086  .1343  .0434
  16  .9746  .8961  .8314  .7437  .6354  .5142  .3910  .2776  .1118  .0325
  18  .9739  .8910  .8227  .7302  .6169  .4917  .3666  .2540  .0958  .0256
  20  .9734  .8868  .8152  .7188  .6016  .4733  .3471  .2355  .0841  .0209
  22  .9729  .8831  .8089  .7092  .5886  .4580  .3311  .2207  .0752  .0175
  24  .9725  .8799  .8034  .7008  .5776  .4451  .3178  .2087  .0683  .0151
  26  .9721  .8772  .7986  .6936  .5681  .4341  .3066  .1987  .0628  .0132
  28  .9718  .8747  .7944  .6873  .5599  .4247  .2971  .1903  .0584  .0118
  30  .9716  .8725  .7906  .6817  .5526  .4164  .2889  .1831  .0547  .0107
  40  .9705  .8645  .7769  .6614  .5266  .3873  .2606  .1590  .0430  .0074
  60  .9694  .8558  .7622  .6400  .4997  .3582  .2332  .1367  .0334  .0050
 120  .9682  .8464  .7464  .6175  .4721  .3292  .2070  .1163  .0255  .0033
   ∞  .9669  .8361  .7295  .5938  .4439  .3005  .1821  .0978  .0192  .0022


TABLE VII (continued)


a= 0.01 
V2 @=.5 1.0 1.2 1.4 1.6 1.8 2.0 22 2.6 3.0 
yy, =4 
Z .9869 9777 9723 .9660 9587 .9506 .9416 9317 .9096 8844 
4 9838 .9604 9438 9221 8946 8612 8217 7767 6725 5566 
6 9812 9433 .9139 8738 8219 7585 .6848 .6036 4330 .2767 
8 .9792 .9284 .8873 .8306 .7575 .6697 .5716 .4691 .2776 .1363
10 9776 .9160 8650 .7944 .1047 5996 4885 3745 1867 .0726 
12 9763 .9056 8464 .7647 .6622 5452 4236 3087 .1330 0424 
14 9752 8969 .8309 .7403 6281 5027 3765 .2620 .0998 .0270 
16 9743 8896 8178 .7200 .6003 4691 3405 .2279 .0783 0184 
18 9736 8834 .8068 .7030 5774 4420 3124 .2023 .0636 0133 
20 .9730 .8780 .1974 .6886 5583 4199 .2901 1825 .0533 .0101 
22 9724 8734 7892 .6763 5421 4015 .2719 1670 0457 .0079 
24 9719 8693 .7821 .6656 5283 .3861 .2570 1545 .0400 .0064 
26 9715 8657 1759 .6563 5164 3730 .2445 1442 .0355 0054 
28 9711 8625 .7704 6482 .5060 3617 .2340 1357 .0320 .0046 
30 .9708 8597 7655 .6409 .4969 3519 .2249 1286 0292 .0040 
40 .9695 .8491 .7473 .6145 .4643 .3176 .1942 .1051 .0207 .0023
60 .9682 8374 7275 5864 4306 .2836 1653 0844 0143 0013 
120 .9666 8245 .7060 5566 .3962 .2504 .1386 .0665 .0096 .0007 
∞ .9649 .8103 .6828 .5253 .3614 .2185 .1144 .0513 .0063 .0004
ν1 = 5
2 .9870 9782 .9730 .9669 .9600 9521 9435 9340 9126 8884 
4 .9840 .9611 9448 9233 8961 8629 8237 .7789 .6749 5591 
6 9814 9435 9138 8729 8199 .7550 6795 5965 4231 .2663 
8 .9793 .9276 8850 8258 £7494 6578 5559 4504 2575 1207 
10 .9776 .9138 .8600 .7852 .6899 .5792 .4615 .3471 .1625 .0581
12 .9762 9021 8387 7510 .6413 5174 3914 2757 1084 .0306 
14 .9750 8920 8206 .7224 .6017 .4690 339] 2257 .0763 0176 
16 .9740 8835 8052 .6984 5692 4306 .2994 1898 0564 0109 
18 9732 8761 .7920 .6782 5423 3998 .2688 1633 0435 0073 
20 .9725 .8696 .7806 .6609 .5198 .3746 .2446 .1433 .0347 .0051
22 9718 .8640 .7706 .6460 5007 3538 2252 1278 0284 .0037 
24 9713 8590 7619 .6330 4844 .3363 .2093 1156 0239 0029 
26 .9708 .8546 .7543 .6217 .4704 .3216 .1962 .1057 .0205 .0023
28 .9704 8507 .7474 6118 4581 .3089 1851 .0976 0179 0018 
30 9700 8472 7413 6029 4474 .2980 1758 .0909 0158 0015 
40 9685 8339 .7186 5705 .4090 .2601 .1446 0695 .0100 .0007 
60 .9668 .8189 .6935 .5359 .3696 .2232 .1163 .0516 .0061 .0003
120 .9650 8023 .6660 4992 3298 1882 0913 0372 .0035 0002 
∞ .9628 .7835 .6360 .4606 .2901 .1556 .0699 .0259 .0020 .0000
ν1 = 6
2 .9871 .9785 .9735 .9675 .9608 .9532 .9447 .9355 .9147 .8910
4 .9841 .9616 9454 9241 .8971 .8640 8250 7802 6764 5605 
6 9815 9435 9135 8720 8180 7518 .6749 5905 4149 .2578 
8 .9794 9267 8826 8216 .7424 6477 5428 4351 .2417 .1090 
10 .9776 9118 8556 .1770 .6772 618 4406 3248 1442 .0480 
12 .9761 8988 8318 .7388 .6230 4938 3647 .2492 .0905 .0230 
14 .9748 8876 8113 .1064 5785 4402 3083 1971 .0600 0120 
16 9737 8778 .7936 .6790 5418 3978 .2659 .1604 0419 .0068 
18 .9728 8692 7783 .6556 5113 3639 .2335 1339 .0306 0042 
20 .9720 8617 .7650 .6356 4857 .3363 .2082 1142 0232 .0027 
22 9713 8551 7533 .6183 4641 .3136 1881 0992 0182 0019 
24 .9707 8493 .7430 .6032 .4456 .2946 .1719 .0876 0147 0013 
26 9701 844] .7339 5900 4297 .2787 1586 .0784 0121 .0010 
28 .9696 .8394 .7258 5783 4158 2651 .1476 .0710 0102 .0008 
30 .9692 8351 7185 5679 4037 .2533 .1383 .0649 0088 .0006 
40 .9675 8191 6911 5299 3605 .2133 .1080 .0462 0049 .0002 
60 9655 .8008 .6607 4891 .3166 1753 .08 16 0315 .0026 .0001 
120 .9633 .7800 .6271 4459 .2728 .1402 .0595 .0205 .0013 .0000 
∞ .9607 .7563 .5901 .4009 .2301 .1089 .0417 .0128 .0006 .0000
ν1 = 7
2 .9872 9787 .9738 .9680 .9614 .9539 .9456 .9365 9161 8929 
4 9842 .9619 9459 9247 8977 8648 8258 7811 .6773 5613 
6 .9816 .9435 9132 8711 8163 .7490 .6710 5854 4082 .2510 
8 9794 .9260 8810 8179 .7363 .6390 5317 4224 .2290 .1000 
10 9775 .9100 8516 1699 .6661 5469 4231 .3065 1299 .0407 
12 .9760 8959 8256 .7280 .6070 4735 3423 .2278 .0770 0178 
14 .9746 8835 8029 .6922 5581 4156 .2828 .1744 0482 .0085 
16 9735 8726 7831 6615 5176 3698 2385 1375 0319 0044 
18 .9724 8629 7658 .6353 4840 .3333 .2049 1113 0221 0025 
20 .9716 8544 .7506 6127 4558 .3038 1791 0923 .0160 0015 
22 .9708 8469 1373 5931 4319 .2796 1587 0781 0120 .0010 
24 .9701 8401 7254 5760 4115 .2596 1425 .0673 .0093 .0006 
26 9695 834] 7149 5610 3940 2429 1294 .0589 .0074 .0005 
28 .9689 8287 £7055 5477 .3787 .2286 .1186 0522 .0060 .0003 
30 .9684 8237 .6971 5359 3654 .2165 .1096 .0469 0050 .0002 
40 .9665 8048 .665 1 4926 3182 1754 0810 .0309 0025 .0001 
60 .9642 .7830 6294 4462 .2709 1375 0572 .0193 0011 .0000 
120 .9616 .7580 5896 3973 .2246 .1038 0384 0112 .0005 .0000 
∞ .9585 .7290 .5456 .3467 .1807 .0751 .0244 .0062 .0002 .0000




TABLE VII (continued ) 


α = 0.01
ν2 φ = .5 1.0 1.2 1.4 1.6 1.8 2.0 2.2 2.6 3.0
ν1 = 8
2 .9872 .9789 .9740 .9683 .9618 9544 .9463 .9373 9172 8943 
4 .9843 9621 .9462 9251 8982 8653 .8264 7817 .6779 5619 
6 9817 9435 9128 .8703 8148 -7467 .6676 811 4026 2454 
8 .9794 9252 .8793 8147 7311 6315 5223 AIT .2186 .0928 
10 9775 .9084 8481 .7635 .6564 5341 4082 2912 .1185 0352 
12 9758 8933 8201 -7184 5930 4559 3234 2101 .0668 0142 
14 .9744 8798 .7953 .6794 5401 3943 .2614 .1560 .0396 .0063 
16 .9732 8678 7735 .6458 4963 3458 .2157 1193 0248 .0030 
18 9721 8571 .7543 .6169 4598 3072 1815 .0938 0163 .0016 
20 9711 8477 .7374 5919 4292 .2762 .1554 .0756 0113 .0009 
22 .9703 8392 1224 5702 4034 .2510 .1352 .0624 .008 1 0005 
24 .9695 8316 7091 5512 3814 .2302 .1193 0524 .0060 .0003 
26 .9689 8247 .6972 5345 3625 .2129 .1065 0449 .0046 .0002 
28 9682 8185 .6866 5198 3461 .1983 .0962 0389 .0036 .0001 
30 .9677 8129 .6770 5067 3318 1859 .0876 0343 .0029 .0001 
40 9655 7911 .6406 4584 2815 .1447 0611 .0209 0012 .0000 
60 .9630 7658 5995 .4070 .2318 .1079 .0402 0118 .0005 .0000 
120 .9600 .7362 .5536 3531 1843 .0765 0247 .006 1 .0002 .0000 
∞ .9563 .7016 .5028 .2981 .1406 .0512 .0141 .0029 .0000 .0000

ν1 = 9
2 .9872 9790 .9742 .9686 .9621 9549 .9468 .9380 9181 8954 
4 .9843 9623 .9464 9254 8986 8657 .8268 .7821 .6783 5623 
6 9817 9434 9125 8696 8135 .7446 .6647 5774 3978 .2407 
8 .9794 .9246 .8778 8119 7265 6251 5142 -4025 .2099 .0871 
10 .9774 .9070 8450 .7579 .6479 5229 3953 .2782 .1093 .0310 
12 9757 8909 8151 -7098 5806 4407 3073 1955 0587 .0116 
14 .9742 8764 .7884 .6679 5242 3759 .2433 .1410 0331 .0047 
16 9729 8634 .7647 .6316 A774 3249 1966 .1047 .0197 .0021 
18 9718 8518 .7438 -6002 4384 .2847 1621 -0800 0124 .0010 
20 .9708 8414 £7252 .5730 4057 2525 1361 .0628 .008 1 .0005 
22 .9698 .8320 .7086 .5493 .3782 .2265 .1162 .0504 .0056 .0003 
24 .9690 8235 .6939 5286 3548 2053 .1007 0414 .0040 .0002 
26 .9683 8159 .6807 5104 .3347 .1877 0885 .0346 .0029 .0001 
28 .9676 .8090 .6689 4943 3174 .1730 .0786 0294 0022 .0001 
30 .9670 .8026 -6581 .4799 3023 .1606 .0706 0254 .0017 .0000 
40 .9646 .7780 .6174 4273 .2497 1199 .0464 0143 .0006 .0000 
60 .9618 .7490 5712 3714 .1986 .0848 0283 .0073 .0002 .0000 
120 9583 .7148 5194 3133 -1508 0562 0158 .0033 .0001 .0000 
∞ .9542 .6745 .4620 .2549 .1085 .0345 .0080 .0014 .0000 .0000

ν1 = 10
2 .9873 9791 9744 .9688 .9624 9552 .9472 9385 9188 8963 
4 9844 .9625 .9466 .9256 .8989 .8660 8271 £7825 .6786 5625 
6 9817 9433 9123 .8690 8124 .7428 .6622 5742 3938 .2367 
8 .9794 .9240 .8765 .8094 7225 .6195 5072 3947 .2026 0823 
10 .9774 9057 8422 .7529 .6404 5131 3842 .2672 1017 0277 
12 .9756 8888 .8106 .7022 5696 4273 .2935 1831 0523 .0097 
14 9741 8734 -7822 .6575 5101 3597 .2279 1285 0282 .0037 
16 9727 8594 7567 .6187 4605 .3068 -1805 .0928 .0160 .0015 
18 9715 8469 .7341 5850 4193 .2652 .1459 .0690 .0096 .0007 
20 .9704 8355 .7139 5558 3848 2322 .1201 .0527 .0060 .0003 
22 .9694 8253 .6959 .5303 3558 .2057 .1007 0413 .0039 .0002 
24 .9685 8160 .6798 5080 3312 1841 0858 0331 .0027 .0001 
26 .9677 .8076 .6653 4883 3102 .1665 .0741 .0270 .0019 .0001 
28 .9670 .7999 .6523 .4709 2921 1518 .0649 .0225 .0014 .0000 
30 .9664 .7929 .6405 4555 .2764 1395 .0574 .0190 .0010 .0000 
40 .9638 7655 5955 .3989 .2220 0998 .0355 .0098 .0003 .0000 
60 .9606 1328 5443 3390 1702 .0668 .0200 0045 .0001 .0000 
120 .9568 .6939 4869 .2776 A232 0411 .0101 0018 .0000 .0000 
∞ .9520 .6475 .4234 .2170 .0832 .0230 .0045 .0007 .0000 .0000

ν1 = 12
2 .9873 .9793 .9746 .9691 .9628 9557 .9478 .9392 .9198 8977 
4 .9844 .9627 .9469 .9260 8993 8665 .8276 7829 .6790 5628 
6 .9818 .9432 9118 .8679 8104 7398 .6581 .5689 3872 .2304 
8 9794 9231 8743 8052 7158 6101 4956 3819 .1910 .0750 
10 9773 .9035 .8375 .7445 6277 4968 3661 .2494 .0900 0228 
12 9754 8850 .8029 .6890 5509 4050 .2708 .1635 0429 .007 1 
14 .9738 8680 7713 .6396 4860 3329 -2030 .1092 0212 .0024 
16 9723 8523 .7426 5963 4318 .2768 .1549 .0749 0110 .0009 
18 .9710 .8380 .7169 5585 3868 2333 .1206 .0529 .0060 .0003 
20 .9698 8250 .6938 5256 3493 1991 .0958 .0384 .0035 .0001 
22 .9687 8131 .6730 .4969 3179 1721 .0775 .0286 .0021 .0001 
24 .9677 8023 6543 4717 2914 1505 .0638 .0219 .0013 .0000 
26 .9668 .71924 .6376 .4496 .2690 .1330 0533 .O171 0009 .0000 
28 9659 .7833 .6223 4300 .2498 1187 .0452 .0136 .0006 .0000 
30 9652 .7750 .6086 4127 2333 -1069 .0389 0110 .0004 .0000 
40 .9622 7419 9556 .3492 .1770 .0701 0212 .0048 .0001 .0000 
60 9584 1019 4951 2833 1256 0417 0101 0018 0000 .0000 
120 9537 6534 4271 .2173 .0819 .0220 .004 1 .0005 .0000 .0000 


∞ .9476 .5948 .3530 .1552 .0479 .0100 .0015 .0001 .0000 .0000




TABLE VII (continued ) 




α = 0.05
ν2 φ = .5 1.0 1.2 1.4 1.6 1.8 2.0 2.2 2.6 3.0
ν1 = 1
2 9271 8617 8256 7847 7402 .6927 .6432 5926 AQIS .3950 
4 9141 8048 -TAIS 6694 5910 5095 4284 3509 .2169 1198 
6 .9077 .7768 .7010 6153 5238 4315 3431 .2629 1374 0611 
8 .9040 7610 6784 5858 .4883 3916 3015 2223 1054 0413 
10 .9017 7510 .6642 5675 .4666 .3680 .2775 1997 .0890 0322 
12 .9000 7440 .6544 5551 4521 3524 .2620 1854 0793 0272 
14 8988 71390 .6474 5462 4418 3414 2513 .1756 .0728 .0240 
16 8979 7351 .6420 5394 4341 .3333 2433 1685 .0683 0219 
18 8972 .7321 .6379 5342 4281 3270 .2373 1631 .0649 .0203 
20 8966 7297 6345 5300 4233 3220 2325 .1589 .0623 0192 
22 8961 7277 .6317 5265 4194 3180 2287 1555 .0603 0183 
24 8957 .7260 .6294 5236 4161 3146 2255 1527 0586 0175 
26 8954 7246 6274 212 4134 3118 .2228 1504 .0573 0169 
28 .8951 .7233 .6258 .5192 .4111 .3094 .2206 .1485 .0561 .0165
30 8948 7223 .6243 5173 .4090 3073 .2186 .1468 0551 0160 
40 8939 7185 6192 5110 .4020 3001 2119 1410 0518 0147 
60 8930 7147 .6140 5047 .3949 .2930 2053 1354 0487 0134 
120 8920 .7108 .6087 4983 .3879 .2859 .1988 1300 0457 0123 
∞ .8910 .7070 .6036 .4920 .3810 .2791 .1926 .1248 .0430 .0112
ν1 = 2
2 .9324 8814 8527 8201 .7840 7451 .7038 .6608 5722 4837 
4 9201 8239 1657 .6976 6219 5414 4598 3804 .2400 1353 
6 9129 7891 7135 .6257 5303 4330 3396 .2554 1264 0520 
8 .9083 1672 .6810 5821 4769 3729 .2773 1955 .0821 .0273 
10 9052 .7523 6592 5536 .4430 3361 .2408 -1624 .0609 0175 
12 .9030 TAIT .6438 5336 4197 3115 .2173 1419 0490 0126 
14 9013 71337 6323 5189 4028 2941 .2010 1281 .0416 .0099 
16 .9000 .7274 .6234 .5077 .3901 .2812 .1892 .1183 .0367 .0082
18 8989 7225 .6164 4988 3802 .2713 1802 1110 .0331 .007 1 
20 8980 .7184 .6107 4917 3723 .2634 1732 1054 0305 .0063 
22 8973 .7150 .6059 4858 3658 .2570 .1675 . 1009 0285 .0057 
24 8967 7122 .6019 4808 3603 2517 .1629 .0973 .0269 .0052 
26 8961 £7097 5985 4767 3558 2472 .1590 .0943 .0256 .0048 
28 8957 .7076 5956 4730 3518 .2434 1558 .0918 0245 0045 
30 8953 7058 5930 .4699 3484 2401 .1530 .0896 .0236 .0043 
40 8938 6992 5839 4588 3365 .2288 1434 0824 .0207 .0035 
60 8923 6924 5746 .4476 3247 2177 1341 .0756 0181 .0029 
120 8908 .6855 5651 .4364 3129 .2069 1253 .0692 0157 0024 
∞ .8892 .6785 .5556 .4251 .3013 .1963 .1168 .0632 .0137 .0019
ν1 = 3
2 9342 8882 8623 8327 .7998 -7640 .7260 .6861 .6030 5187 
4 9221 8302 .7735 .7064 6311 5505 .4683 3880 .2453 1384 
6 9144 .7909 .7134 .6226 5235 4225 3264 .2407 1132 0435 
8 9092 71643 .6733 5683 -4570 3482 .2504 1694 .0639 .0184 
10 9056 7454 .6453 5314 4134 3019 .2059 1307 .0419 .0098 
12 9028 .7314 .6249 .5050 3831 .2709 .1776 .1074 .0305 .0061 
14 .9007 71207 .6093 4853 3611 .2490 1583 0922 0238 0042 
16 8990 7122 5972 4701 3443 .2328 1444 .0817 .0196 .0031 
18 8976 .7054 5874 4581 3313 .2204 1340 .0740 0167 0025 
20 8965 .6997 5794 4483 3208 .2106 1259 .0682 .0146 .0020 
22 8955 .6950 5728 4402 3122 .2026 1195 .0637 .0131 0017 
24 8947 .6909 5671 .4333 3051 1961 1143 .0601 O119 0015 
26 8940 .6875 5623 4275 .2990 1907 .1100 0571 0110 0013 
28 8934 .6845 5581 4225 .2938 1860 .1064 0547 .0103 .0012 
30 8928 6818 5544 4182 .2894 1820 1033 0526 .0097 0011 
40 8909 .6723 5414 4028 .2738 1684 .0930 .0458 .0078 .0008 
60 8888 .6624 5279 3872 2583 1552 .0833 .0397 0062 .0006 
120 8866 6522 5142 3716 2431 1425 .0743 .0342 0049 .0004 
∞ .8843 .6415 .5000 .3557 .2280 .1304 .0659 .0293 .0038 .0003
ν1 = 4
2 9351 8917 8672 8391 .8079 .7738 .7375 .6993 6193 5374 
4 9232 8332 7771 .7103 .6350 5542 4714 3905 2466 1389 
6 9151 .7906 7112 6178 5158 4122 3143 2282 .1030 0375 
8 9094 .7602 .6649 5549 4389 3271 .2286 .1493 0515 0132 
10 .9052 .7378 6315 5110 3876 .2736 1788 .1076 .0301 0059 
12 .9020 7208 .6066 4791 3516 2380 1475 .0833 0199 0032 
14 8995 .7076 5875 4550 3253 2129 1266 .0680 0143 0019 
16 8975 .6970 5723 .4363 .3054 1945 1118 0577 0109 .0013 
18 8958 .6883 5600 4214 2898 1804 .1009 .0503 0087 .0009 
20 8945 6811 5498 .4092 .2774 1695 .0926 .0449 .0073 .0007 
22 8933 .6750 5413 3991 .2672 1607 .0861 .0408 .0062 .0006 
24 8923 .6698 5341 3907 2587 1535 .0808 .0376 0054 .0005 
26 8914 .6653 5279 3834 .2516 1475 .0765 .0349 .0049 0004 
28 8906 .6614 5225 3772 2455 1424 .0730 .0328 0044 .0003 
30 8899 .6579 5178 3718 .2402 1381 .0700 0311 .0040 0003 
40 8874 6454 5009 3526 2219 1234 .0601 .0254 0029 .0002 
60 8848 .6322 4833 3332 .2040 1095 0511 .0206 .002 1 .0001 
120 8819 6183 4652 3136 1865 .0965 0431 .0164 0015 .0001 
∞ .8789 .6038 .4466 .2940 .1695 .0844 .0360 .0130 .0011 .0000




TABLE VII (continued ) 


α = 0.05
ν2 φ = .5 1.0 1.2 1.4 1.6 1.8 2.0 2.2 2.6 3.0
ν1 = 5
2 .9356 8939 .8702 8431 8128 .7798 7445 -7074 .6293 5490 
4 9238 8349 7791 .7124 .6369 5558 4727 3914 .2467 1386 
6 9154 -7897 .7087 .6131 5088 4033 3044 2181 .0952 .0333 
8 .9093 7561 .6573 5432 4237 3099 2115 1342 .0430 0100 
10 .9048 .7308 .6193 4933 3660 .2509 .1579 .0909 0227 .0038 
12 9012 7111 .5904 .4566 3254 2118 1249 .0665 .0136 .0018 
14 8983 .6956 5679 4287 .2957 1845 .1033 .0516 .0090 .0010 
16 8960 .6829 5499 .4069 .2732 1646 .0883 0418 .0064 .0006 
18 8941 6725 5352 3895 2557 .1496 .0774 0351 .0048 .0004 
20 8924 .6638 5231 3753 2417 .1380 .0693 .0303 .0038 .0003 
22 8910 .6564 5129 3635 .2303 1288 .0630 .0267 .0031 .0002 
24 8898 .6501 5042 3536 .2209 1213 .0580 .0240 .0026 .0001 
26 8888 6445 .4967 3451 .2129 151 .0540 .0218 .0022 .0001 
28 .8878 .6397 .4901 .3379 .2062 .1099 .0507 .0201 .0019 .0001 
30 8870 .6354 4844 3315 .2003 1055 .0479 .0186 0017 .0001 
40 8840 .6198 .4638 3091 .1803 .0908 .0390 0142 0011 .0000 
60 8807 .6033 4423 .2864 .1609 .0772 .0313 .0106 .0007 .0000 
120 8771 5857 4201 .2638 1422 .0649 0247 .0078 0004 .0000 
∞ .8733 .5671 .3971 .2412 .1245 .0538 .0192 .0056 .0003 .0000
ν1 = 6
2 .9360 8953 .8722 .8457 8161 .7839 .7493 .7129 .6361 5569 
4 9242 8361 .7803 .7136 .6380 5567 4733 3916 .2464 1381 
6 9156 7887 .7063 .6090 5028 3959 .2962 .2100 0893 .0301 
8 .9092 £7525 .6506 5332 4109 .2958 .1978 1225 .0369 .0080 
10 9042 7245 -.6086 4782 3480 .2325 1417 .0784 .0177 .0026 
12 .9003 71024 .5761 4373 3036 .1908 .1077 0544 .0097 .0011 
14 .8972 .6847 .5506 4061 2711 .1619 0859 .0401 .0059 .0005 
16 8946 .6702 5301 3816 2465 1412 .0710 .0312 .0039 .0003 
18 8924 6582 5132 3621 2275 1257 .0605 0252 .0028 .0002 
20 8905 .6480 .4992 3461 .2124 1139 0528 .0210 .0021 .0001 
22 8889 .6394 4874 .3328 .2001 1045 .0469 .0179 .0016 .0001 
24 8875 .6319 4773 3216 .1900 .0970 .0423 0157 .0013 .0001 
26 8863 6253 .4686 3121 1815 .0908 .0387 .0139 0011 .0000 
28 8852 .6196 .4610 3039 1744 .0857 .0357 0125 .0009 .0000 
30 8843 6145 4543 .2968 -1682 .0814 .0333 0114 .0008 .0000 
40 .8807 5960 .4302 .2717 1471 .0672 .0256 008 1 .0004 .0000 
60 .8768 .5760 .4050 .2464 .1270 0545 .0193 .0055 .0002 .0000 
120 8724 5547 .3789 .2214 .1082 0434 .0141 .0037 .0001 .0000 
∞ .8677 .5319 .3520 .1967 .0907 .0339 .0101 .0024 .0000 .0000
ν1 = 7
2 .9363 8963 .8736 .8476 8185 .7868 .7527 .7168 .6410 5627 
4 9245 8368 7811 .7144 .6387 5571 4735 3916 .2460 .1376 
6 9157 1878 .7042 .6054 4978 3897 .2895 .2035 .0846 .0278 
8 .9090 .7492 .6449 5247 4002 .2841 .1868 1133 .0323 .0065 
10 .9038 .7189 5992 4652 3328 2174 1288 0689 0143 0019 
12 8996 .6947 5636 .4207 2852 1738 .0944 0454 .0072 .0007 
14 8961 .6750 5353 3866 .2505 1439 .0726 .0320 0041 .0003 
16 8933 6588 5125 3598 2243 1226 0582 .0238 .0025 .0001 
18 8908 6452 .4936 .3383 2041 .1070 .0482 0185 .0017 .0001 
20 8888 .6336 .4779 3208 1882 0951 .0409 .0149 .0012 .0000 
22 8870 6238 .4646 3062 1753 0858 0355 .0123 .0009 .0000 
24 8854 6152 4532 .2940 .1647 0785 .0314 .0105 .0007 .0000 
26 .8840 .6077 4433 .2836 1559 0725 .0282 0091 .0005 .0000 
28 8828 6011 4347 2747 1485 .0676 .0256 .0080 .0004 .0000 
30 8817 5952 4272 .2669 1421 .0634 .0234 .007 1 .0003 .0000 
40 .8776 5737 .3998 .2396 1206 0501 .0170 .0046 .0002 .0000 
60 .8730 5504 3713 2124 1005 .0387 O19 .0029 .0001 .0000 
120 .8679 5253 .3417 1857 0821 .0290 .0080 .0017 0000 .0000 
∞ .8622 .4983 .3112 .1597 .0656 .0211 .0052 .0010 .0000 .0000
ν1 = 8
2 .9365 8971 .8747 .8490 8203 7889 .7553 .7198 .6448 5671 
4 9274 8374 7817 .7149 .6391 5574 4735 3914 .2456 1371 
6 9158 -7869 .7024 .6023 4935 3845 .2839 1981 .0809 0259 
8 .9088 7464 .6398 5173 3910 .2744 .1777 1059 0289 0055 
10 .9033 .7140 5910 .4540 3200 .2049 1184 0615 0118 .0014 
12 8989 .6878 .5526 .4063 .2697 1598 0838 .0387 0055 0005 
14 8951 .6663 5218 3696 .2330 .1292 .0624 .0260 .0029 .0002 
16 .8920 6484 .4968 .3407 .2056 .1077 .0485 .0186 .0017 .0001 
18 8894 .6334 4761 .3176 .1846 0921 .0390 .0139 .0010 .0000 
20 8871 .6205 4588 2988 1680 .0804 .0323 .0108 .0007 .0000 
22 8851 .6095 444] .2832 1548 0713 .0274 .0087 .0005 .0000 
24 8834 5999 4315 .2700 .1439 .0642 .0237 .0072 .0003 .0000 
26 .8819 .5915 .4206 .2589 .1349 .0585 .0208 .0060 .0003 .0000
28 .8805 .5840 .4111 .2493 .1274 .0538 .0186 .0052 .0002 .0000
30 .8793 5774 .4027 .2410 .1209 .0499 .0168 .0045 .0002 .0000 
40 8746 5530 3725 .2120 .0995 .0377 0114 .0027 .0001 .0000 
60 .8694 5264 .3408 1834 .0798 .0275 .0074 0015 .0000 .0000 
120 .8635 4975 3081 .1556 0623 .0193 .0046 .0008 .0000 .0000 


∞ .8568 .4663 .2745 .1292 .0472 .0130 .0027 .0004 .0000 .0000




TABLE VII (continued) 


α = 0.05
ν2 φ = .5 1.2 2.0
ν1 = 9
2 .9366 .8756 .7573
4 .9249 .7821 .4735
6 .9158 .7007 .2792
8 .9087 .6354 .1702
10 .9029 .5838 .1099
12 .8982 .5428 .0753
14 .8943 .5097 .0543
16 .8909 .4827 .0409
18 .8881 .4604 .0320
20 .8856 .4416 .0259
22 .8835 .4257 .0214
24 .8816 .4120 .0181
26 .8799 .4002 .0156
28 .8784 .3898 .0137
30 .8770 .3807 .0121
40 .8718 .3477 .0077
60 .8660 .3133 .0046
120 .8592 .2778 .0026
∞ .8514 .2417 .0014
ν1 = 10
2 .9368 .8762 .7589
4 .9250 .7825 .4734
6 .9158 .6992 .2751
8 .9085 .6315 .1638
10 .9026 .5774 .1028
12 .8976 .5340 .0683
14 .8935 .4990 .0478
16 .8899 .4702 .0350
18 .8869 .4463 .0267
20 .8843 .4262 .0210
22 .8819 .4091 .0170
24 .8799 .3944 .0141
26 .8780 .3817 .0119
28 .8764 .3705 .0102
30 .8749 .3607 .0089
40 .8692 .3253 .0053
60 .8627 .2884 .0029
120 .8551 .2506 .0015
∞ .8462 .2124 .0007
ν1 = 12
2 .9369 .8772 .7614
4 .9252 .7829 .4731
6 .9159 .6968 .2686
8 .9082 .6250 .1536
10 .9019 .5666 .0917
12 .8966 .5192 .0577
14 .8921 .4805 .0382
16 .8882 .4485 .0265
18 .8848 .4219 .0192
20 .8818 .3994 .0144
22 .8792 .3803 .0111
24 .8768 .3638 .0088
26 .8747 .3496 .0072
28 .8728 .3371 .0059
30 .8710 .3261 .0050
40 .8643 .2865 .0026
60 .8565 .2456 .0012
120 .8472 .2042 .0005
∞ .8359 .1632 .0002

φ = 2.6 (partial column; row positions uncertain): .6477 .2452 .0778 .0262 .0100 ... .0021 .0011




TABLE VII (continued ) 


α = 0.10
ν2 φ = .5 1.0 1.2 1.4 1.6 1.8 2.0 2.2 2.6 3.0
ν1 = 1
2 .8582 1443 .6846 .6202 5534 .4863 .4209 3588 .2491 .1628 
4 .8410 .6773 5919 5017 4118 3266 .2500 .1846 .0899 .0375 
6 .8336 .6498 .5552 .4570 .3613 .2738 .1985 .1373 .0570 .0194
8 .8296 .6353 .5363 .4344 .3367 .2490 .1753 .1172 .0447 .0137
10 .8271 .6265 .5248 .4209 .3223 .2348 .1623 .1063 .0385 .0110
12 .8254 .6205 .5171 .4120 .3128 .2256 .1541 .0996 .0348 .0096
14 .8242 .6162 5116 .4057 3062 2192 1485 .0949 .0324 .0086 
16 .8232 .6130 5075 4010 3012 2145 .1444 .0916 .0307 .0080 
18 8225 .6104 5043 3973 .2974 .2109 1412 .0891 .0294 .0075 
20 8219 .6084 5017 .3944 .2944 .2081 .1388 .087 I .0285 .0072 
22 8214 .6068 .4996 3920 .2920 .2058 .1368 .0856 .0277 .0069 
24 .8210 .6054 .4979 .3900 .2900 .2038 1351 .0843 .0271 .0067 
26 .8207 .6042 .4964 3884 .2882 .2022 .1338 .0832 .0266 .0065 
28 .8204 .6032 4951 3869 .2868 .2009 .1326 .0823 .0261 .0063 
30 8201 .6023 .4940 .3857 .2855 .1997 1316 .0815 .0257 .0062 
40 8192 5993 .4902 3814 2811 .1956 1281 .0788 0245 .0058 
60 .8183 5962 .4864 3771 .2768 .1916 .1248 .0762 .0233 .0054 
120 8174 .5932 .4826 .3729 .2726 .1877 1215 .0737 .0222 .0050 
∞ .8165 .5901 .4788 .3686 .2683 .1838 .1183 .0713 .0211 .0047
ν1 = 2
2 .8669 .7746 7252 .6707 .6130 5536 .4939 4355 3265 .2333 
4 .8486 .6981 .6159 5268 4358 3481 .2680 .1987 .0972 .0405 
6 .8392 .6593 5623 .4595 3586 .2662 .1876 .1525 0471 .0140 
8 8335 .6369 5319 .4228 3183 .2260 . 1508 .0944 .0302 .0073 
10 .8298 .6223 5126 .4000 .2940 .2027 .1305 .0783 .0226 .0048 
12 .8272 .6122 .4994 .3846 .2780 .1877 .1179 .0686 .0184 .0035 
14 £8252 .6047 .4897 .3734 .2666 .1772 .1093 .0622 .0158 .0028 
16 .8237 .5990 4823 3651 2581 .1696 .1031 .0578 .0140 .0024 
18 8225 5945 4765 3586 2516 .1638 .0984 .0544 .0128 .002 1 
20 .8215 .5908 .4719 .3533 .2464 .1592 .0948 .0519 .0119 .0019
22 .8207 5878 .4680 .3490 .2422 .1555 .0919 .0498 .0012 .0017 
24 .8200 5853 .4648 3455 .2387 .1524 .0895 .0482 .0106 .0016 
26 .8194 5831 .4621 3425 2357 .1498 .0875 .0468 .0102 0015 
28 .8190 5812 .4598 3399 .2332 .1477 .0859 .0457 .0098 .0014 
30 8185 .5796 4577 .3376 .2310 1458 0844 .0447 .0095 .0014 
40 8169 5739 .4506 .3298 .2235 .1394 .0796 0415 .0084 0011 
60 .8153 5681 .4433 .3220 .2160 .1331 .0749 .0384 .0075 .0010 
120 8137 5622 .4361 .3142 .2087 .1270 .0705 .0355 .0067 .0008 
∞ .8120 .5562 .4288 .3064 .2015 .1211 .0662 .0328 .0059 .0007
ν1 = 3
2 .8700 .7858 .7403 .6899 .6359 5799 5231 .4668 3597 .2655 
4 8513 .71047 .6230 5336 4416 .3525 .2709 .2002 .0970 .0399 
6 .8406 6585 5581 4514 3468 .2521 .1729 1117 .0386 .0103 
8 .8338 .6298 .5190 .4040 .2953 .2016 .1281 .0755 .0208 .0041 
10 8291 .6106 .4933 .3738 .2638 .1724 .1038 .0574 .0135 .0022 
12 .8257 5968 .4752 .3531 .2429 .1537 .0890 .0470 .0098 .0014 
14 .8232 5864 .4618 .3380 .2280 .1408 .0792 .0404 .0077 .0010 
16 8211 5784 4516 .3266 .2170 1315 .0723 .0359 .0064 .0007 
18 .8195 5720 .4434 3177 .2085 .1244 .0671 .0326 .0055 .0006 
20 .8182 5668 .4369 .3105 .2017 .1189 .0632 .0301 .0049 .0005 
22 .8170 5624 4314 .3047 .1963 1145 .0601 .0282 .0044 .0004 
24 8161 5588 .4268 .2998 .1917 .1108 .0576 .0267 .0040 .0004 
26 8153 5556 .4230 .2956 .1879 .1078 .0555 .0255 .0038 .0003 
28 8145 5529 .4196 2921 .1847 .1053 .0537 .0244 0035 .0003 
30 8139 5506 .4167 .2890 1819 1031 .0523 .0236 .0033 .0003 
40 8117 5421 .4063 .2782 .1722 .0956 .0473 .0207 .0027 .0002 
60 .8093 5335 .3959 .2674 .1627 0885 .0427 .0182 .0022 .0002 
120 .8069 5246 .3853 .2567 .1535 .0817 .0384 .0159 .0018 .0001 
∞ .8044 .5156 .3745 .2460 .1444 .0752 .0344 .0138 .0015 .0001
ν1 = 4
2 .8716 .7916 .7482 .6999 .6480 5939 5387 .4837 3781 .2836 
4 8527 1077 .6259 5360 .4432 3532 .2708 .1995 .0958 .0390 
6 .8410 .6559 5527 .4429 3358 .2399 .1610 1012 .0327 .0080 
8 .8333 .6223 .5066 3871 .2758 1821 1110 .0622 0151 .0026 
10 .8279 5989 .4754 .3509 .2388 .1489 .0846 .0436 .0086 0011 
12 8238 5818 .4531 .3257 .2142 .1279 .0689 .0333 .0056 .0006 
14 .8207 5689 .4365 .3074 .1968 .1136 .0587 .0271 .0040 .0004 
16 .8182 5587 .4236 2935 .1839 .1033 .0517 .0229 .0031 .0002 
18 8161 5505 .4133 .2826 .1740 .0957 .0467 .0201 .0025 .0002 
20 .8144 5438 .4050 .2738 .1662 .0898 .0428 .0179 .0021 .0001 
22 .8130 5382 3981 .2666 .1599 .0851 .0398 .0163 .0018 .0001 
24 8118 5334 .3922 .2606 .1547 .0812 .0375 O51 .0016 .0001 
26 .8107 5293 .3872 .2555 .1503 .0781 .0355 .0141 .0014 .0001 
28 .8098 5258 .3829 .2512 .1466 .0754 .0339 .0133 .0013 .0001 
30 .8090 5226 .3792 .2474 1434 .0732 .0326 .0126 .0012 .0001 
40 .8061 S115 3659 .2343 1325 .0655 .0281 .0104 .0009 .0000 
60 .8030 5000 .3524 2211 .1219 .0584 .0241 .0085 .0007 .0000 
120 .7997 4881 3387 .2081 1117 .0518 .0206 .0069 .0005 .0000 


∞ .7963 .4758 .3248 .1952 .1019 .0457 .0174 .0056 .0003 .0000




TABLE VII (continued ) 


α = 0.10
ν2 φ = .5 1.0 1.2 1.4 1.6 1.8 2.0 2.2 2.6 3.0
ν1 = 5
2 .8726 .7952 .7530 .7061 .6555 .6026 .5485 .4943 .3897 .2953
4 8534 7093 6273 .5369 4435 3528 .2699 1983 0945 0381 
6 8412 6532 5477 .4354 3266 .2300 1516 0933 .0286 .0066 
8 8327 6154 4957 3728 2599 .1668 0982 0528 OLLS 0017 
10 8266 5885 4599 3316 .2187 .1308 0707 0343 0058 0006 
12 8219 5685 4339 3029 1913 1084 .0548 0245 .0034 0003 
14 8183 5532 4144 .2818 .1720 .0934 0448 0188 .0022 0002 
16 8153 5410 399] .2658 1578 .0827 .0380 0152 .0016 .0001 
18 8129 5311 3869 2532 .1470 .0749 .0332 0128 0012 0001 
20 8109 5229 .3770 .2432 1385 0689 0297 0110 0010 0000 
pepo 8092 5161 3687 .2349 .1316 0642 0270 .0097 .0008 .0000 
24 8077 5103 3618 .2280 .1260 .0604 0249 0087 .0007 .0000 
26 8064 5053 3558 woe2 1213 0573 0232 .0080 .0006 .0000 
28 8053 5009 3507 .2173 1174 0547 .0218 0074 0005 0000 
30 8043 4971 3462 .2129 .1140 0525 .0206 .0068 .0004 0000 
40 8007 4833 .3302 .1979 .1024 0452 0169 .0053 .0003 .0000 
60 7968 4689 3140 .1830 .0914 0386 .0137 .0040 .0002 .0000 
120 1927 4539 .2974 1684 .0810 0327 .0109 .0030 000 1 .0000 
∞ .7883 .4384 .2807 .1540 .0712 .0274 .0086 .0022 .0001 .0000
ν1 = 6
2 8732 1976 .7563 .7103 .6606 6085 5552 5016 3978 .3035 
4 8540 -7102 628] 5373 4434 3522 .2689 197] .0934 .0373 
6 8411 6507 5432 4291 3189 .2220 1442 0872 0256 0056 
8 8321 6094 4864 3609 .2469 1548 0884 0459 .0092 0012 
10 8254 5794 4465 3155 .2023 1168 .0604 0278 0041 .0004 
12 8202 5568 4174 .2837 .1728 0935 .0446 .0187 0022 0001 
14 8161 5392 3952 .2603 1522 .0781 0350 .0136 0013 .0001 
16 8127 252 3779 2426 .1371 .0674 .0286 0104 .0009 .0000 
18 8100 5137 3640 2287 1257 0596 0243 .0084 0006 .0000 
20 .8076 5042 3526 .2176 .1167 0538 0211 .0070 0005 0000 
22 8056 4962 3432 .2085 .1096 0492 .0187 .0060 .0004 0000 
24 8039 4894 .3352 .2009 1038 0456 0169 0052 .0003 .0000 
26 8024 4835 3283 1945 0989 0426 0154 .0046 .0002 .0000 
28 8011 4783 3224 1891 .0949 0402 0142 0042 0002 .0000 
30 £7999 4738 3172 .1844 0914 0382 0133 .0038 0002 .0000 
40 7956 4574 .2989 .1680 .0797 0315 .0103 .0027 000 1 0000 
60 7910 4403 .2802 1519 .0688 0257 0078 .0019 .0001 .0000 
120 £7859 4223 .2612 1362 .0587 0206 0058 .0013 .0000 .0000 
∞ .7805 .4035 .2420 .1210 .0494 .0162 .0042 .0009 .0000 .0000
ν1 = 7
2 8737 1994 7587 .7133 .6643 6129 5600 5069 4037 .3095 
4 8543 .7108 6285 .5373 4431 3515 .2680 .1960 0924 .0367 
6 8411 6485 5394 4238 3126 2154 1383 .0825 .0234 0049 
8 8315 6042 4784 3509 .2362 1451 .0808 .0407 .0076 0009 
10 8243 5714 4351 3020 .1890 .1057 0525 0231 0031 0002 
12 8186 5465 4030 .2675 .1578 0819 .0371 .0146 0015 0001 
14 8141 5269 3786 .2423 .1362 .0665 0279 0101 .0008 .0000 
16 .8103 S111 3594 2231 1204 0559 0221 .0074 .0005 .0000 
18 8072 4982 3440 .208 1 1086 0483 0181 0057 0003 0000 
20 8046 4874 3313 .1962 .0995 0427 0153 .0046 0002 .0000 
22 8024 4784 3208 .1864 0922 .0383 0132 .0038 0002 .0000 
24 .8004 .4705 3119 .1783 .0863 .0349 0117 .0032 0001 .0000 
26 1987 4638 3043 715 0815 0322 .0105 .0028 .0001 .0000 
28 7972 4579 2977 .1657 .0774 .0300 0095 .0024 0001 0000 
30 7958 4527 2919 1606 .0740 .0281 0087 .0022 .0001 0000 
40 .7908 4339 .2715 1433 .0625 .0222 .0063 .0014 .0000 .0000 
60 7854 4140 .2507 1264 0520 0172 0045 .0009 0000 .0000 
120 .7794 3932 .2296 1102 .0426 0130 0031 .0006 .0000 .0000 
∞ .7728 .3712 .2083 .0948 .0342 .0096 .0021 .0003 .0000 .0000
ν1 = 8
2 8740 .8006 -7604 .7156 .6670 -6160 5636 5109 408 1 3140 
4 8546 7112 6287 5373 4427 3509 .2671 .1950 0915 0362 
6 8410 .6466 5361 .4192 3072 .2100 1334 .0786 0216 0043 
8 8310 5996 4716 3423 Pa i 1371 .0747 .0367 .0064 0007 
10 8233 5645 4251 .2904 .1778 .0968 0465 0196 .0023 0002 
12 8172 5373 3906 .2538 1454 0727 0315 .O117 .0010 .0000 
14 8123 5159 3641 .2269 .1230 0574 0228 .0077 .0005 .0000 
16 8082 4986 3432 .2066 .1069 .0470 0174 0054 .0003 .0000 
18 8048 4843 3264 -1907 .0949 .0397 0138 .0040 0002 .0000 
20 8019 4724 3126 1781 0857 .0344 0114 .0031 .0001 .0000 
22 .7994 4622 3012 .1678 .0784 .0303 .0096 .0025 0001 .0000 
24 1972 4535 .2914 1593 .0726 .0272 .0083 .0020 .0001 .0000 
26 7952 .4460 2831 1521 .0678 0247 .0072 .0017 .0000 .0000 
28 .7935 4394 .2759 .1460 .0638 .0226 .0065 0015 .0000 .0000 
30 .7920 4335 .2696 .1408 .0604 .0210 .0058 .0013 .0000 .0000 
40 .7863 4123 .2473 1228 .0494 0158 0040 .0008 0000 0000 
60 -7801 3899 .2247 .1056 .0395 0116 .0026 0004 .0000 .0000 
120 773) 3663 .2019 .0893 .0309 .0082 .0016 0002 .0000 .0000 


∞ .7653 .3414 .1791 .0740 .0235 .0056 .0010 .0001 .0000 .0000




TABLE VII (continued ) 


α = 0.10
ν2 φ = .5 1.0 1.2 1.4 1.6 1.8 2.0 2.2 2.6 3.0
ν1 = 9
2 8743 8017 7619 7174 .6693 .6186 .5666 5141 A117 3177 
4 8548 7115 6288 5372 4423 3503 .2663 .1942 .0908 0357 
6 .8409 .6449 5333 4153 .3027 .2055 .1294 0755 .0202 .0039 
8 8305 5956 4656 3350 .2198 .1305 .0698 .0335 .0055 .0006 
10 8224 5583 4165 .2805 .1685 .0894 .0417 .0170 .OO19 .0001 
12 8160 5293 3797 .2420 .1350 .0653 .0271 .0096 .0007 .0000 
14 8107 5061 3513 .2137 L121 0501 .0189 .0060 .0003 .0000 
16 8063 A873 3290 .1924 0958 .0401 .0139 .0040 .0002 0000 
18 8025 A718 3109 .1758 .0837 .0331 .0107 .0028 0001 .0000 
20 .7994 A588 .2962 .1627 0745 0281 .0086 .0021 .0001 .0000 
22 .1966 4476 .2838 .1520 .0673 0243 .0071 .0016 .0000 .0000 
24 1942 4381 .2734 1432 .0616 0214 .0059 .0013 .0000 .0000 
26 7921 4298 .2644 1358 .0569 .O191 .005 1 0011 -0000 0000 
28 .7902 4225 .2567 .1295 .0530 .0173 .0045 .0009 .0000 .0000 
30 7885 Al61 .2500 .1241 .0498 0159 .0040 .0008 .0000 .0000 
40 7821 3927 2261 1058 .0393 .O114 .0025 .0004 .0000 .0000 
60 .7750 3678 .2019 0885 .0302 .0078 0015 .0002 .0000 -.0000 
120 7671 3415 1777 .0724 .0224 .0052 .0009 .0001 -.0000 .0000 
∞ .7581 .3138 .1537 .0577 .0161 .0033 .0005 .0000 .0000 .0000
ν1 = 10
2 8746 8025 .7630 7188 .6710 .6207 5689 5167 4146 3206 
4 8550 W117 .6289 .5370 4419 3497 .2656 1935 .0902 0353 
6 8408 6434 5308 A119 .2988 .2016 .1260 .0728 .O191 .0036 
8 8301 5921 4604 3287 .2133 1250 .0657 .0309 .0049 .0005 
10 8216 5528 A088 2719 .1605 .0834 .0378 0149 0015 .0001 
12 8148 5220 .3700 2317 .1263 .0592 .0237 .008 | -.0006 .0000 
14 8092 4974 3401 .2023 .1030 .0443 .0160 .0048 .0002 .0000 
16 8045 A772 3164 .1802 .0866 .0346 O114 .0031 0001 .0000 
18 8005 4605 2972 .1630 .0745 0279 .0085 .0021 0001 .0000 
20 7971 4464 2815 .1494 .0654 0232 .0066 0015 .0000 0000 
22 7941 4344 .2684 1384 0583 .0197 .0053 OO .0000 .0000 
24 7914 4240 .2573 .1294 0527 0171 .0044 .0009 .0000 .0000 
26 7891 4150 .2479 .1219 .0482 O15] .0037 .0007 .0000 .0000 
28 7870 4071 .2397 LISS 0445 0134 .0031 .0006 .0000 0000 
30 7852 4001 .2326 1101 0414 0121 .0027 .0005 .0000 .0000 
40 7782 3746 .2073 .0916 0315 .0083 .0016 .0002 .0000 .0000 
60 .7703 3475 1819 .0745 .0232 .0054 .0009 .0001 .0000 .0000 
120 7613 3186 .1566 .0588 .0163 .0033 .0005 .0000 .0000 .0000 
∞ .7510 .2883 .1319 .0449 .0110 .0019 .0002 .0000 .0000 .0000
ν1 = 12

2 8749 8037 .7646 7210 .6736 .6237 5723 5204 4188 3250 
4 8552 .7120 .6289 5368 4413 3488 .2645 .1923 .0893 .0348 
6 8406 .6409 5267 4064 .2925 .1954 .1207 .0688 .0174 .0032 
8 8293 5862 S17 3182 .2029 1161 .0594 .0270 .0039 .0003 
10 8203 5436 3961 2577 1477 0739 .0320 .0120 OO11 0001 
12 8129 .5097 3538 .2149 1123 .0500 .0187 .0059 .0003 .0000 
14 8067 4823 3210 .1836 .0887 .0356 .O118 .0032 0001 .0000 
16 8014 A597 .2950 .1603 .0722 .0266 .0079 0019 .0001 .0000 
18 .1969 .4409 .2740 1423 .0603 .0206 .0056 0012 .0000 .0000 
20 .7930 4249 2568 1281 .OS15 .0164 .0041 .0008 .0000 .0000 
22 7896 AlI3 2424 -1167 .0448 0135 .003 1 .0006 .0000 .0000 
24 .7865 3994 .2303 .1074 -0396 0113 .0025 -.0004 .0000 .0000 
26 7838 3892 .2199 .0998 .0354 .0096 .0020 .0003 .0000 .0000 
28 -7814 3801 2110 .0933 .0320 .0083 .0016 .0002 -0000 .0000 
30 -7792 3721 .2032 .0879 .0292 .0073 .0014 .0002 .0000 .0000 
40 -7709 3427 .1759 .0697 .0207 0045 -.0007 0001 .0000 .0000 
60 7614 3114 .1486 .0532 .0139 .0026 .0003 -.0000 .0000 .0000 
120 -7503 2781 1220 .0389 -.0087 .0013 .0001 .0000 .0000 .0000 


ore) .1373 2431 0967 .0270 .005 1 .0006 0000 .0000 .0000 .0000 


From M. L. Tiku, “Tables of the Power of the F-Test,” Journal of the American
Statistical Association, 62 (1967), 525-539 and M. L. Tiku, “More Tables of
the Power of the F-Test,” Journal of the American Statistical Association, 67
(1972), 709-710. Abridged and adapted by permission.




Table VIII. Power Values and Optimum Number of Levels
for Total Number of Observations in the One-Way Random
Effects Analysis of Variance F Test


This table gives power estimates in the one-way random effects analysis of
variance F test for specified values of θ₀ (the value of σ_α²/σ_e² under H₀), θ
(the value of σ_α²/σ_e² under H₁), N = an (total number of observations), a (the
number of treatment groups or levels), and α (the level of significance). Each
entry lists the number of levels a that is optimum for the given N, followed by
the corresponding power. For example, in a one-way random effects analysis of
variance, consider the simple hypothesis that there are no treatment effects
(θ₀ = 0) and the researcher wants to reject the null hypothesis if σ_α²/σ_e² is
as large as 1.0 (θ = 1.0) at α = 0.05. For 20 treatment groups with 5 subjects
per group (a = 20, n = 5, N = 100), the power of the test is equal to 0.998.
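
The worked example above can be checked numerically. The following is a minimal sketch, not the method used to build the table: it assumes the usual balanced normal-theory model, under which the ratio of the between-group to the within-group mean square is distributed as (1 + nθ) times a central F variable with a − 1 and a(n − 1) degrees of freedom, so that H₀: θ ≤ θ₀ is rejected when the ratio exceeds (1 + nθ₀) times the upper α point of that F distribution. The function names (`betainc`, `f_cdf`, `f_quantile`, `random_effects_power`) are illustrative only.

```python
import math

def _betacf(a, b, x, max_iter=500, eps=1e-13):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    c, d = 1.0, 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        # Even and odd coefficients of the continued fraction for step m.
        for num in (
            m * (b - m) * x / ((a + 2 * m - 1.0) * (a + 2 * m)),
            -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1.0)),
        ):
            d = 1.0 + num * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + num / c
            c = c if abs(c) > tiny else tiny
            h *= d * c
        if abs(d * c - 1.0) < eps:
            break
    return h

def betainc(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log1p(-x))
    if x < (a + 1.0) / (a + b + 2.0):
        return math.exp(log_front) * _betacf(a, b, x) / a
    return 1.0 - math.exp(log_front) * _betacf(b, a, 1.0 - x) / b

def f_cdf(f, d1, d2):
    """P(F <= f) for a central F distribution with (d1, d2) degrees of freedom."""
    if f <= 0.0:
        return 0.0
    return betainc(d1 / 2.0, d2 / 2.0, d1 * f / (d1 * f + d2))

def f_quantile(p, d1, d2):
    """Quantile of the central F distribution, found by bisection."""
    lo, hi = 0.0, 1.0
    while f_cdf(hi, d1, d2) < p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f_cdf(mid, d1, d2) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def random_effects_power(a, n, theta0, theta, alpha):
    """Power of the F test of H0: theta <= theta0 at the alternative theta."""
    d1, d2 = a - 1, a * (n - 1)
    crit = f_quantile(1.0 - alpha, d1, d2)
    return 1.0 - f_cdf(crit * (1.0 + n * theta0) / (1.0 + n * theta), d1, d2)

# Example from the text: a = 20, n = 5, theta0 = 0, theta = 1.0, alpha = 0.05.
print(round(random_effects_power(20, 5, 0.0, 1.0, 0.05), 3))
```

Under these assumptions the computed power should agree closely with the tabulated entry for the example. The F distribution is coded by hand only to keep the sketch free of third-party dependencies; a statistical library would normally be used instead.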




α = 0.01
θ₀ = 0.00
N \ θ      0.2        0.4        0.6        0.8        1.0        2.0        3.0        4.0
10       2, .045    2, .089    2, .132    2, .172    2, .208    2, .341    2, .426    2, .485
20       2, .114    2, .214    4, .302    4, .394    4, .471    5, .706    5, .822    5, .881
30       3, .180    3, .348    5, .474    5, .586    5, .668    6, .875   10, .948   10, .977
40       3, .246    4, .463    5, .615    5, .716    8, .795   10, .954   10, .986   13, .995
50       3, .306    5, .561    5, .708    7, .809   10, .878   10, .981   16, .996   16, .999
60       4, .383    6, .644    6, .788   10, .881   10, .930   15, .994   20, .999   12, 1.00
70       5, .439    7, .713   10, .852   10, .924   14, .960   14, .998   14, 1.00   10, 1.00
80       5, .497    8, .770   10, .896   10, .950   16, .977   20, .999   11, 1.00    9, 1.00
90       5, .547    9, .817   10, .925   15, .970   18, .988   15, 1.00   10, 1.00    8, 1.00
100      5, .591   10, .855   11, .946   20, .980   20, .993   14, 1.00    9, 1.00    8, 1.00
300     15, .965   30, .999    8, 1.00    7, 1.00    6, 1.00    6, 1.00    6, 1.00    5, 1.00
500     15, 1.00    8, 1.00    7, 1.00    7, 1.00    6, 1.00    6, 1.00    6, 1.00    5, 1.00

θ₀ = 0.10
N \ θ      0.3        0.5        0.7        0.9        1.0        2.0        3.0        4.0
10       2, .032    2, .059    2, .089    2, .118    2, .132    2, .250    2, .334    2, .396
20       2, .057    4, .120    4, .194    4, .267    4, .302    5, .567    5, .718    5, .804
30       3, .082    5, .188    5, .302    6, .405    6, .454    6, .751   10, .891   10, .948
40       4, .106       ?       8, .395    8, .527    8, .582   10, .878   10, .958   13, .984
50       5, .130    7, .309   10, .484   10, .629   10, .686   10, .931   16, .984   16, .996
60       6, .153   10, .371   10, .569   12, .713   12, .768   15, .971   20, .996   20, .999
70       7, .176   10, .430   14, .636   14, .781   14, .830   23, .985   23, .999   17, 1.00
80       8, .199   10, .479   16, .697   16, .834   16, .877   20, .994   26, 1.00   15, 1.00
90      10, .223   15, .530   15, .751   18, .876   18, .912   30, .997   18, 1.00   14, 1.00
100     10, .245   14, .569   20, .795   20, .907   20, .978   25, .999   16, 1.00   12, 1.00
300     30, .632   50, .967   60, .998   50, 1.00   30, 1.00   17, 1.00   10, 1.00    8, 1.00
500     62, .850   50, .998   22, 1.00   16, 1.00   15, 1.00   10, 1.00    8, 1.00    8, 1.00

θ₀ = 0.50
N \ θ      0.7        0.8        0.9        1.0        2.0        3.0        4.0        5.0
10       2, .018    2, .023    2, .028    2, .034    2, .095    2, .155    2, .208       ?
20       4, .024    4, .034    4, .045    4, .057    5, .218    5, .380    5, .508    5, .604
30       5, .029    6, .043    6, .060    6, .080   10, .329   10, .570   10, .730   10, .828
40       8, .034    8, .053    8, .076    8, .102   10, .445   13, .344   13, .847   13, .919
50      10, .039   10, .062   10, .091   10, .125   16, .526   16, .799   16, .916   16, .963
60      12, .043   12, .071   12, .107   15, .149   20, .634   20, .885   20, .964   20, .988
70      14, .048   14, .081   14, .122   14, .171   23, .703   23, .926   23, .982   23, .995
80      16, .052   16, .090   20, .139   20, .197   20, .762   26, .953   26, .991   40, .998
90      18, .056   18, .099   18, .154   18, .218   30, .823   30, .975   30, .996   45, .999
100     20, .061   20, .109   25, .175   25, .246   33, .856   33, .985   33, .998   30, 1.00
500    125, .255  125, .511  125, .739  125, .884   68, 1.00   20, 1.00   15, 1.00   13, 1.00
1000   250, .503  250, .828  333, .966  190, 1.00   35, 1.00   20, 1.00   15, 1.00   13, 1.00

θ₀ = 1.00
N \ θ      1.2        1.4        1.6        1.8        2.0        3.0        4.0        5.0
10       2, .015    2, .020    2, .025    2, .032    2, .038    2, .074    2, .111    2, .146
20       4, .017    5, .027    5, .039    5, .054    5, .070    5, .166    5, .270    5, .365
30       6, .020    6, .034    6, .052    6, .073   10, .098   10, .260   10, .429   10, .570
40      10, .022   10, .040   10, .065   10, .096   10, .132   10, .344   13, .550   13, .702
50      10, .024   10, .046   10, .076   12, .113   12, .157   16, .425   16, .653   16, .799
60      15, .026   15, .053   15, .091   15, .140   20, .198   20, .525   20, .760   20, .885
70      14, .027   17, .058   23, .102   23, .161   23, .229   23, .592   23, .821   23, .926
80      20, .029   20, .065   20, .118   20, .185   20, .262   26, .651   26, .869   40, .955
90      22, .031   30, .071   30, .132   30, .211   30, .302   30, .721   30, .914   30, .975
100     25, .033   25, .078   25, .145   33, .233   33, .333   33, .765   33, .938   50, .985
500    166, .103  166, .365  166, .678  166, .882  166, .967   94, 1.00   34, 1.00   24, 1.00
1000   333, .202  333, .677  333, .994  250, 1.00  142, 1.00   52, 1.00   34, 1.00   24, 1.00


Table VIII (continued)


α = 0.05
θ₀ = 0.00
N \ θ      0.2        0.4        0.6        0.8        1.0        2.0        3.0        4.0
10       2, .142    2, .220    2, .282    2, ?       2, .374    3, .518    3, .622    5, .693
20       2, .241    4, .386    4, .507    4, .596    4, .662    5, .847    5, .914    5, .945
30       3, .342    5, .530    5, .666    6, .756    6, .818   10, .949   10, .984   10, .994
40       4, .424    5, .645    8, .768    8, .854    8, .904   10, .984   13, .996   13, .999
50       5, .495    5, .724   10, .843   10, .914   10, .950   16, .994   16, .999   12, 1.00
60       5, .564    6, .792   10, .900   12, .950   12, .974   20, .999   12, 1.00   10, 1.00
70       5, .621    7, .843   10, .934   14, .971   14, .987   23, 1.00   10, 1.00    8, 1.00
80       5, .668   10, .883   10, .955   16, .983   16, .993   13, 1.00    9, 1.00    8, 1.00
90       6, .715   10, .913   15, .965   18, .990   18, .997   11, 1.00    8, 1.00    7, 1.00
100      7, .746   10, .933   14, .980   20, .995   20, .998   10, 1.00    8, 1.00    7, 1.00
300     20, .989   23, 1.00    8, 1.00    7, 1.00    6, 1.00    5, 1.00    5, 1.00    5, 1.00
500     10, 1.00    7, 1.00    7, 1.00    6, 1.00    6, 1.00    5, 1.00    5, 1.00    5, 1.00

θ₀ = 0.10
N \ θ      0.3        0.5        0.7        0.9        1.0        2.0        3.0        4.0
10       2, .112    2, .169    2, .220    2, ?       2, .282    3, .436    3, .547    5, .628
20       4, .163    5, .283    4, .386    5, .473    5, .515    5, .753    5, .854   10, .907
30       5, .211    5, .377    6, .513    6, .618    6, .661   10, .884   10, .962   10, .984
40       5, .255    8, .445    8, .616    8, .727   10, .771   10, .952   13, .988   13, .996
50       7, .290   10, .526   10, .698   10, .806   10, .843   16, .978   16, .996   16, .999
60      10, .325   10, .592   12, .764   15, .863   15, .897   20, .993   20, .999   14, 1.00
70      10, .364   10, .644   14, .816   14, .904   14, .930   23, .997   17, 1.00   13, 1.00
80      10, .397   16, .692   16, .857   20, .934   20, .955   26, .999   15, 1.00   11, 1.00
90      10, .426   15, .738   18, .890   18, .954   18, .969   30, 1.00   14, 1.00   10, 1.00
100     11, .453   20, .771   20, .915   25, .969   25, .981   20, 1.00   12, 1.00   10, 1.00
300     37, .821   50, .991   48, 1.00   26, 1.00   21, 1.00   15, 1.00    8, 1.00    7, 1.00
500     71, .949   29, 1.00   17, 1.00   13, 1.00   12, 1.00    9, 1.00    8, 1.00    7, 1.00

θ₀ = 0.50
N \ θ      0.7        0.8        0.9        1.0        2.0        3.0        4.0        5.0
10       2, .076    2, .090    2, .103    2, .116    3, .239    3, .343    5, .429    5, .509
20       5, .095       ?       5, .147    5, .175    5, .429    5, .601   10, .720   10, .809
30       6, .109    6, .144    6, .181    6, .218   10, .578   10, .784   10, .884   10, .934
40       8, .122   10, .167   10, .215   10, .265   13, .677   13, .870   20, .945   20, .978
50      10, .134   10, .186   10, .242   10, .300   16, .755   16, .924   25, .977   25, .993
60      15, .145   15, .207   15, .275   15, .344   20, .832   20, .963   20, .991   30, .998
70      14, .155   14, .224   14, .298   17, .373   23, .875   23, .979   35, .996   35, .999
80      20, .166   20, .245   20, .330   20, .415   26, .907   26, .988   40, .999   24, 1.00
90      18, .175   18, .260   22, .351   30, .442   30, .938   30, .994   45, .999   21, 1.00
100     25, .186   25, .281   25, .381   25, .480   33, .955   33, .997   30, 1.00   20, 1.00
500    125, .499  125, .749  125, .899  166, .967   27, 1.00   16, 1.00   13, 1.00   11, 1.00
1000   250, .745  333, .944  237, 1.00  115, 1.00   27, 1.00   16, 1.00   13, 1.00   11, 1.00

θ₀ = 1.00
N \ θ      1.2        1.4        1.6        1.8        2.0        3.0        4.0        5.0
10       2, .065    2, .081    2, .096    3, .112    3, .129    3, .209    3, .281    5, .350
20       5, .076    5, .105    5, .137    5, .169    5, .203    5, .361   10, .491   10, .610
30      10, .083   10, .123   10, .168   10, .216   10, .267   10, .501   10, .672   10, .784
40      10, .090   10, .139   10, .195   10, .255   13, .317   13, .595   13, .771   20, .878
50      12, .094   16, .152   16, .219   16, .291   16, .365   16, .672   25, .848   25, .935
60      15, .101   20, .170   20, .251   20, .338   20, .424   20, .756   20, .906   30, .966
70      23, .106   23, .183   23, .274   23, .371   23, .466   23, .805   35, .938   35, .982
80      20, .112   20, .196   26, .296   26, .402   26, .505   26, .845   40, .961   40, .991
90      30, .116   30, .212   30, .325   30, .442   30, .533   30, .887   45, .976   45, .996
100     25, .121   33, .224   33, .346   33, .471   33, .587   33, .911   50, .985   50, .998
500    166, .277  166, .625  166, .868  166, .967  166, .993   77, 1.00   25, 1.00   19, 1.00
1000   333, .438  333, .867  333, .992  142, 1.00  100, 1.00   41, 1.00   25, 1.00   19, 1.00




Table VIII (continued) 


10 2.229 
20 4, .331 
30 3, 444 
40 4, .531 
50 5, .602 
60 5, .662 
70 7,711 
80 8, .753 
90 6, .791 
100 9, 821 
300 20, .994 
500 9, 1.00 


10 2, .188 
20 4, .258 
30 5, .316 
40 5, .363 
50 7, 406 
60 10, .448 
70 10, .487 
80 10, .520 
90 15, .550 
100 14, .577 
300 50, .892 
500 71, 975 


10 2; 
20 3; 
30 6, 
40 10, 
50 10, 
60 15, 
70 14, 
80 20, 
90 18, 
100 25, 
500 125, 
1000 250, 


10 3, 
20 35 
30 10, 
40 10, 
50 16, 
60 20, 
70 23, 
80 20, 
90 30, . 
100 33, 
500 166, 
1000 333, 


From R. S. Barcikowski, 


124 
141 
152 
162 
171 
.180 
.187 
194 


202 


.209 
.408 
580 


1. 


16, 
20, 
23; 
26, 
30, 
33, 
166, 
333, 


4 


149 
. 184 
211 
.232 
.253 
.275 
292 
.308 
.328 
343 
.749 
.928 


1.6 


Bi 

5, 
10, 
13, 
16, 
20, 
23; 
26, 
30, 


35,3 
929 


166, 


.173 
227 
272 
.305 
.340 
375 
402 
A27 
459 


482 


233, 1.00 


a= 0.10 
A) = 0.00 
0.8 1.0 
2, .430 2, .470 
5, .694 5, .755 
6, .830 6, .877 
8, .906 10, .941 
10, .948 10, .971 
12, .971 15, .986 
14, .984 14, .993 
16, .991 20, .997 
18, .995 18, .998 
20, .998 20, .999 
7, 1.00 6, 1.00 
6, 1.00 6, 1.00 
Ay = 0.10 
0.9 1.0 
3, .360 3, 385 
5, 592 5, .628 
6, .720 6, .754 
10, .816 10, .850 
10, .873 10, .900 
15, 918 15, .940 
14, .943 14, .960 
20, .964 20, .977 
18, .975 22, .984 
25, .984 25, .991 
23, 1.00 21, 1.00 
12, 1.00 11, 1.00 
Ao = 0.50 
1.0 2.0 
3, .196 3, .355 
5, .276 5, 551 
10, .333 10, .700 
10, .386 13, .784 
12, .423 16, .845 
15, .474 20, .901 
23, .507 23, .930 
20, .548 26, .950 
30, .580 30, .969 
25, 612 33, .978 
166, .985 23, 1.00 
93, 1.00 23, 1.00 
89 = 1.00 
1.8 2.0 
3, .196 3, .219 
5, .269 5, 310 
10, .333 10, .392 
13, .379 13, .449 
16, .426 16, .506 
20, .472 20, .561 
23, .507 23, .602 
26, .540 26, .639 
30, .580 30, .683 
33, .608 33, .713 
166, .985 166, .997 
116, 1.00 82, 1.00 


3.0 


5, .737 
5, .944 
10, .992 
13, .998 
16, 1.00 
10, 1.00 
9, 1.00 
8, 1.00 
8, 1.00 
7, 1.00 
5, 1.00 
5, 1.00 




4.0 


5, .808 
10, .972 
10, .997 
13, 1.00 
10, 1.00 

8, 1.00 

7, 1.00 

7, 1.00 

6, 1.00 

6, 1.00 

5, 1.00 

5, 1.00 


4.0 


5, .758 
10, .953 
10, .992 
20, .998 
25, 1.00 
12, 1.00 
10, 1.00 
10, 1.00 

9, 1.00 

9, 1.00 

7, 1.00 

7, 1.00 


5.0 


5, .657 
10, .892 
15, .968 
20, .991 
25, .997 
30, .999 
23, 1.00 
20, 1.00 
18, 1.00 
20, 1.00 
10, 1.00 
10, 1.00 


5.0 


5, .502 
10, .743 
15, .871 
20, .936 
25, .969 
30, .985 
35, .993 
40, .997 
45, .999 
50, .999 
17, 1.00 
17, 1.00 


From R. S. Barcikowski, “Optimum Sample Size and Number of Levels in a
One-Way Random Effects Analysis of Variance,” The Journal of Experimental
Education, 41 (1973), 10-16. Reprinted by permission.




Table IX. Minimum Sample Size per Treatment Group Needed
for a Given Value of p, α, 1 − β, and Effect Size (C) in Sigma
Units


This table gives the minimum sample size per treatment group needed in
the one-way fixed effects analysis of variance design corresponding to α =
0.10, 0.05, 0.01; 1 − β = 0.7, 0.8, 0.9, 0.95; C = Δ/σ_e = 1.0 (0.25) 2 (0.5) 3;
and p = 2 (1) 11, 13. Here, Δ designates the magnitude of the difference between
any pair of treatment groups that is meaningful to detect with probability
of at least 1 − β. For example, in a one-way fixed effects analysis of variance
design, for p = 3, α = 0.05, 1 − β = 0.8, and C = 1.0, the required sample
size per treatment group is 21.
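
The range notation used in these table descriptions, such as "1.0 (0.25) 2 (0.5) 3", lists a starting value, a step in parentheses, and an end value, and may chain several such runs. As a small illustration, the sketch below expands that notation into the explicit list of tabulated values. The helper name `expand_grid` is hypothetical, and the parser assumes plain decimal numbers only (the symbol ∞ is not handled).

```python
import re
from decimal import Decimal

def expand_grid(spec):
    """Expand 'start (step) end (step) end, extra' into a list of values."""
    values = []
    for part in spec.split(","):
        current = None   # last end point reached in this run
        step = None      # step announced in parentheses, if any
        # Each token is either a parenthesized step or a bare end point.
        for step_tok, val_tok in re.findall(r"\(([^)]+)\)|([^\s()]+)", part):
            if step_tok:
                step = Decimal(step_tok)
                continue
            target = Decimal(val_tok)
            if current is None or step is None:
                values.append(target)
            else:
                x = current + step
                while x <= target:
                    values.append(x)
                    x += step
            current, step = target, None
    return [float(v) for v in values]

# Effect sizes C and numbers of groups p covered by the table.
print(expand_grid("1.0 (0.25) 2 (0.5) 3"))
print(expand_grid("2 (1) 11, 13"))
```

Decimal arithmetic is used so that repeated addition of a step such as 0.25 does not accumulate floating-point error before the comparison with the end point.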


                1 − β = 0.70                           1 − β = 0.80
                     C                                      C
p    α    1.00 1.25 1.50 1.75 2.00 2.50 3.00    1.00 1.25 1.50 1.75 2.00 2.50 3.00
2 10 U7 6 4 4 3.3 49 7 5 4 3 3 
0 4 9 #7 6 5 4 3 7 12 9 7 6 4 4 
Ol 21 15 11 9 7 5 5 2% 17 «+13 ~~ «10 8 6 5 
3 10 13 9 5 4 3 3 7 11 8 5 4 3 
0 17 #11 8 7 5 4 3 21 14 10 8 6 5 4 
01 25 17 «+12 «10 8 6 § 30 200«d@#sssa 9 7 5 
4 10 15 100 #7 6 5 4 3 9 13 #9 7 6 4 3 
0 19 13 9 7 6 4 4 23 #15 7 #5 4 
01 28 19 13 =~ «10 8 6 5 33 22 «16 «120«21002=COeTsti‘CS 
5 10 17 8 6 5 4 3 21 14 10 8 6 4 4 
0 21 14 «+10 ~~ 8 6 5 4 2 17 «12 7 5 4 
01 30 2 14 «11 9 6 5 35 23 #17 «+13 «10 =#«7~ «6 
6 10 18 12 9 7 5 4 3 22 #15 «ol 8 5 4 
0 22 #15 #11 8 7 5 4 27 «18 «#613~—«10 8 6 4 
01 32 21 #15 #12 9 #7 = § 33 25 #18 13 «+ 8 6 
7 10 19 13 9 7 6 4. 3 24 16 UI 9 5 4 
0 24 16 11 9 7 5 4 2 «+19 14 10 8 6 5 
01 34 22 #1 #12 «10 #7 5 39 2 «18 «#614061 i BkttC 
8 10 20 13 #10 #7 6 4 3 25 16 12 9 7 5 4 
0 2 146 #12 #9 #7 +5 4 30 20 0«14@ ss 6 5 
Ol 35 2 17 «+13 #10 #7 ~~ 5 41 27 19 15 12 8 6 
9 10 21 14 #10 8 4 4 2% 17 12 9 7 5 4 
0 2 17° «12 8 5 4 3121 0«215~=CsoFT 6 5 
Ol 37 #24 #17 ~«130«61000¢~«Sé~C«CS 43 28 #20 15 12 8 6 
10 10 22 14 «+10 ~~ 8 5 4 27 «+18 #13 ~©10 8 5 4 
05 27 +18 «13 «10 8 6 4 33 21 «15012 6 5 
01 38 2 18 #14 #1 7+ 6 4429 21 #16 #12 «8 «6 
11 10 23 #15 8 5 4 28 #18 +13 10 8 6 4 
05 28 19 413 ~~ «10 6 4 a4 22> 62 7 5 
01 39 2 18 #14 «11° °=«8~«6 46 30 21 #16 #13 9 ~=«7 
13.10 24 16 £11 9 7 5 4 30 200«dAsiaa 8 6 4 
0 30 2 #14 «1 9 6 5 36 240=C«=«aITs'sia2Bs’—iad‘Os—issCS 
01 42 27 #19 #15 #12 48 6 49 32 2 #17 #13 9 ~=«~7 


Table IX (continued)

                 1 − β = 0.90                          1 − β = 0.95
                      C                                     C
 p    α   1.00 1.25 1.50 1.75 2.00 2.50 3.00    1.00 1.25 1.50 1.75 2.00 2.50 3.00
 2  .10     18   12    9    7    6    4    3      23   15   11    8    7    5    4
    .05     23   15   11    8    7    5    4      27   18   13   10    8    6    5
    .01     32   21   15    ?   10    7    6      38   25   18   14   11    8    6
 3  .10     22   15   11    8    7    5    4      27   18   13   10    8    6    4
    .05     27   18   13   10    8    6    5      32   21   15   12    9    7    5
    .01     37   24   18   13   11    8    6      43   29   20   16   12    9    7
 4  .10     25   16   12    9    7    5    4      30   20   14   11    9    6    5
    .05     30   20   14   11    9    6    5      36   23   17   13   10    7    5
    .01     40   27   19   15   12    8    6      47   31   22   17   13    9    7
 5  .10     27   18   13   10    8    5    4       ?    ?   15   12    9    6    5
    .05     32   21   15   12    9    6    5      39   25   18   14   11    7    6
    .01     43   28   20   15   12    9    7      51   33   23   18   14   10    7
 6  .10     29   19   14   10    8    6    4      35   23   16   12   10    7    5
    .05     34   23   16   12   10    ?    5      41   27   19   14   11    8    6
    .01     46   30   21   16   13    9    7      53   35   25   19   15   10    8
 7  .10     31   20   14   11    9    6    5      37   24   17   13   10    ?    5
    .05     36   24   17   13   10    7    5      43   28   20   15   12    8    6
    .01     48   31   22   17   13    9    7      56   36   26   19   15   10    8
 8  .10     32   21   15   11    9    6    5      39   25   18   14   11    7    5
    .05     38   25   18   13   11    7    6      45   29   21   16   12    8    6
    .01     50   33   23   17   14    9    7      58   38   27   20   16   11    8
 9  .10     33   22   16   12    9    6    5      40   26   19   14   11    8    6
    .05     40   26   18   14   11    8    6      47   30   22   16   13    9    6
    .01     52   34   24   18   14   10    7      60   39   28   21   16   11    8
10  .10     35   23   16   12   10    7    5      42   27   19   15   11    8    6
    .05     41   27   19   14   11    8    6      48   31   22   17   13    9    7
    .01     54   35   25   19   15   10    ?      62   40   29   21   17   11    8
11  .10     36   23   17   13   10    7    5      43   28   20   15   12    8    6
    .05     42   28   20   15   12    8    6      50   33   23   17   14    9    7
    .01     55   36   26   19   15   10    8      64   42   29   22   17   12    9
13  .10     38   25   18   13   11    7    5      46   30   21   16   12    8    6
    .05     45   29   21   16   12    8    6      53   34   24   18   14   10    7
    .01     59   38   27   20   16   11    8      68   44   31   23   18   12    9


From T. L. Bratcher, A. M. Moran, and W. J. Zimmer, “Tables of Sample Sizes
in the Analysis of Variance,” Journal of Quality Technology, 2 (1970), 156-164.
Abridged and adapted by permission. The adaptation is due to R. E. Kirk,
Experimental Design, Third Edition, © 1995 by Brooks/Cole, Monterey, CA.




Table X. Critical Values of the Studentized Range Distribution 


This table gives the critical values of the Studentized range distribution used in
multiple comparisons. The critical values are designated as q[p, ν; 1 − α] corresponding
to a given value of α, p as the total number of treatment groups or r
as the number of steps between ordered means, and ν as the number of degrees
of freedom for the error. The critical values are given for α = 0.05, 0.01; p =
2 (1) 20; and ν = 2 (1) 20, 24, 30, 40, 60, 120, ∞. For example, for α =
0.05, p = 4, and ν = 20, the required critical value is q[4, 20; 0.95] = 3.96.
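
One common use of a tabulated value such as q[4, 20; 0.95] = 3.96 is Tukey's pairwise comparison procedure with equal group sizes, in which two means are declared different when their absolute difference exceeds q times the square root of MSE/n. The sketch below assumes that procedure. The error mean square, group size, and group means are hypothetical numbers chosen so that p = 4 groups of n = 6 observations give ν = 20 error degrees of freedom.

```python
import math
from itertools import combinations

def tukey_hsd(q_crit, mse, n):
    """Smallest difference between two means declared significant (equal n)."""
    return q_crit * math.sqrt(mse / n)

q_crit = 3.96            # q[4, 20; 0.95], read from the table
mse, n = 4.0, 6          # hypothetical error mean square and group size
means = {"A": 10.0, "B": 11.5, "C": 14.0, "D": 15.2}   # hypothetical group means

hsd = tukey_hsd(q_crit, mse, n)
significant = [
    (g, h) for g, h in combinations(means, 2)
    if abs(means[g] - means[h]) > hsd
]
print(round(hsd, 3), significant)
```

With these illustrative inputs the threshold is about 3.23, so only the pairs whose means differ by more than that amount are flagged.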


Number of Means (p) or Number of Steps Between Ordered Means (r)

  ν    α       2      3      4      5      6      7      8      9     10     11
  2  .05    6.08   8.33   9.80  10.90  11.70  12.40  13.00  13.50  14.00  14.40
     .01   14.00  19.00  22.30  24.70  26.60  28.20  29.50  30.70  31.70  32.60
  3  .05    4.50   5.91   6.82   7.50   8.04   8.48   8.85   9.18   9.46   9.72
     .01    8.26  10.60  12.20  13.30  14.20  15.00  15.60  16.20  16.70  17.80
  4  .05    3.93   5.04   5.76   6.29   6.71   7.05   7.35   7.60   7.83   8.03
     .01    6.51   8.12   9.17   9.96  10.60  11.10  11.50  11.90  12.30  12.60
  5  .05    3.64   4.60   5.22   5.67   6.03   6.33   6.58   6.80   6.99   7.17
     .01    5.70   6.98   7.80   8.42   8.91   9.32   9.67   9.97  10.24  10.48
  6  .05    3.46   4.34   4.90   5.30   5.63   5.90   6.12   6.32   6.49   6.65
     .01    5.24   6.33   7.03   7.56   7.97   8.32   8.61   8.87   9.10   9.30
  7  .05    3.34   4.16   4.68   5.06   5.36   5.61   5.82   6.00   6.16   6.30
     .01    4.95   5.92   6.54   7.01   7.37   7.68   7.94   8.17   8.37   8.55
  8  .05    3.26   4.04   4.53   4.89   5.17   5.40   5.60   5.77   5.92   6.05
     .01    4.75   5.64   6.20   6.62   6.96   7.24   7.47   7.68   7.86   8.03
  9  .05    3.20   3.95   4.41   4.76   5.02   5.24   5.43   5.59   5.74   5.87
     .01    4.60   5.43   5.96   6.35   6.66   6.91   7.13   7.33   7.49   7.65
 10  .05    3.15   3.88   4.33   4.65   4.91   5.12   5.30   5.46   5.60   5.72
     .01    4.48   5.27   5.77   6.14   6.43   6.67   6.87   7.05   7.21   7.36
 11  .05    3.11   3.82   4.26   4.57   4.82   5.03   5.20   5.35   5.49   5.61
     .01    4.39   5.15   5.62   5.97   6.25   6.48   6.67   6.84   6.99   7.13
 12  .05    3.08   3.77   4.20   4.51   4.75   4.95   5.12   5.27   5.39   5.51
     .01    4.32   5.05   5.50   5.84   6.10   6.32   6.51   6.67   6.81   6.94
 13  .05    3.06   3.73   4.15   4.45   4.69   4.88   5.05   5.19   5.32   5.43
     .01    4.26   4.96   5.40   5.73   5.98   6.19   6.37   6.53   6.67   6.79
 14  .05    3.03   3.70   4.11   4.41   4.64   4.83   4.99   5.13   5.25   5.36
     .01    4.21   4.89   5.32   5.63   5.88   6.08   6.26   6.41   6.54   6.66
 15  .05    3.01   3.67   4.08   4.37   4.59   4.78   4.94   5.08   5.20   5.31
     .01    4.17   4.84   5.25   5.56   5.80   5.99   6.16   6.31   6.44   6.55
 16  .05    3.00   3.65   4.05   4.33   4.56   4.74   4.90   5.03   5.15   5.26
     .01    4.13   4.79   5.19   5.49   5.72   5.92   6.08   6.22   6.35   6.46
 17  .05    2.98   3.63   4.02   4.30   4.52   4.70   4.86   4.99   5.11   5.21
     .01    4.10   4.74   5.14   5.43   5.66   5.85   6.01   6.15   6.27   6.38
 18  .05    2.97   3.61   4.00   4.28   4.49   4.67   4.82   4.96   5.07   5.17
     .01    4.07   4.70   5.09   5.38   5.60   5.79   5.94   6.08   6.20   6.31
 19  .05    2.96   3.59   3.98   4.25   4.47   4.65   4.79   4.92   5.04   5.14
     .01    4.05   4.67   5.05   5.33   5.55   5.73   5.89   6.02   6.14   6.25
 20  .05    2.95   3.58   3.96   4.23   4.45   4.62   4.77   4.90   5.01   5.11
     .01    4.02   4.64   5.02   5.29   5.51   5.69   5.84   5.97   6.09   6.19
 24  .05    2.92   3.53   3.90   4.17   4.37   4.54   4.68   4.81   4.92   5.01
     .01    3.96   4.55   4.91   5.17   5.37   5.54   5.69   5.81   5.92   6.02
 30  .05    2.89   3.49   3.85   4.10   4.30   4.46   4.60   4.72   4.82   4.92
     .01    3.89   4.45   4.80   5.05   5.24   5.40   5.54   5.65   5.76   5.85
 40  .05    2.86   3.44   3.79   4.04   4.23   4.39   4.52   4.63   4.73   4.82
     .01    3.82   4.37   4.70   4.93   5.11   5.26   5.39   5.50   5.60   5.69
 60  .05    2.83   3.40   3.74   3.98   4.16   4.31   4.44   4.55   4.65   4.73
     .01    3.76   4.28   4.59   4.82   4.99   5.13   5.25   5.36   5.45   5.53
120  .05    2.80   3.36   3.68   3.92   4.10   4.24   4.36   4.47   4.56   4.64
     .01    3.70   4.20   4.50   4.71   4.87   5.01   5.12   5.21   5.30   5.37
  ∞  .05    2.77   3.31   3.63   3.86   4.03   4.17   4.29   4.39   4.47   4.55
     .01    3.64   4.12   4.40   4.60   4.76   4.88   4.99   5.08   5.16   5.23


Table X (continued)

Number of Means (p) or Number of Steps Between Ordered Means (r)

  ν    α      12     13     14     15     16     17     18     19     20
  2  .05   14.70  15.10  15.40  15.70  15.90  16.10  16.40  16.60  16.80
     .01   33.40  34.10  34.80  35.40  36.00  36.50  37.00  37.50  37.90
  3  .05    9.72  10.20  10.30  10.50  10.70  10.80  11.00  11.10  11.20
     .01   17.50  17.90  18.20  18.50  18.80  19.10  19.30  19.50  19.80
  4  .05    8.21   8.37   8.52   8.66   8.79   8.91   9.03   9.13   9.23
     .01   12.80  13.10  13.30  13.50  13.70  13.90  14.10  14.20  14.40
  5  .05    7.32   7.47   7.60   7.72   7.83   7.93   8.03   8.12   8.21
     .01   10.70  10.89  11.08  11.24  11.40  11.55  11.68  11.81  11.93
  6  .05    6.79   6.92   7.03   7.14   7.24   7.34   7.43   7.51   7.59
     .01    9.48   9.65   9.81   9.95  10.08  10.21  10.32  10.43  10.54
  7  .05    6.43   6.55   6.66   6.76   6.85   6.94   7.02   7.10   7.17
     .01    8.71   8.86   9.00   9.12   9.24   9.35   9.46   9.55   9.65
  8  .05    6.18   6.29   6.39   6.48   6.57   6.65   6.73   6.80   6.87
     .01    8.18   8.31   8.44   8.55   8.66   8.76   8.85   8.94   9.03
  9  .05    5.98   6.09   6.19   6.28   6.36   6.44   6.51   6.58   6.64
     .01    7.78   7.91   8.03   8.13   8.23   8.33   8.41   8.49   8.57
 10  .05    5.83   5.93   6.03   6.11   6.19   6.27   6.34   6.40   6.47
     .01    7.49   7.60   7.71   7.81   7.91   7.99   8.08   8.15   8.23
 11  .05    5.71   5.81   5.90   5.98   6.06   6.13   6.20   6.27   6.33
     .01    7.25   7.36   7.46   7.56   7.65   7.73   7.81   7.88   7.95
 12  .05    5.61   5.71   5.80   5.88   5.95   6.02   6.09   6.15   6.21
     .01    7.06   7.17   7.26   7.36   7.44   7.52   7.59   7.66   7.73
 13  .05    5.53   5.63   5.71   5.79   5.86   5.93   5.99   6.05   6.11
     .01    6.90   7.01   7.10   7.19   7.27   7.35   7.42   7.48   7.55
 14  .05    5.46   5.55   5.64   5.71   5.79   5.85   5.91   5.97   6.03
     .01    6.77   6.87   6.96   7.05   7.13   7.20   7.27   7.33   7.39
 15  .05    5.40   5.49   5.57   5.65   5.72   5.78   5.85   5.90   5.96
     .01    6.66   6.76   6.84   6.93   7.00   7.07   7.14   7.20   7.26
 16  .05    5.35   5.44   5.52   5.59   5.66   5.73   5.79   5.84   5.90
     .01    6.56   6.66   6.74   6.82   6.90   6.97   7.03   7.09   7.15
 17  .05    5.31   5.39   5.47   5.54   5.61   5.67   5.73   5.79   5.84
     .01    6.48   6.57   6.66   6.73   6.81   6.87   6.94   7.00   7.05
 18  .05    5.27   5.35   5.43   5.50   5.57   5.63   5.69   5.74   5.79
     .01    6.41   6.50   6.58   6.65   6.73   6.79   6.85   6.91   6.97
 19  .05    5.23   5.31   5.39   5.46   5.53   5.59   5.65   5.70   5.75
     .01    6.34   6.43   6.51   6.58   6.65   6.72   6.78   6.84   6.89
 20  .05    5.20   5.28   5.36   5.43   5.49   5.55   5.61   5.66   5.71
     .01    6.28   6.37   6.45   6.52   6.59   6.65   6.71   6.77   6.82
 24  .05    5.10   5.18   5.25   5.32   5.38   5.44   5.49   5.55   5.59
     .01    6.11   6.19   6.26   6.33   6.39   6.45   6.51   6.56   6.61
 30  .05    5.00   5.08   5.15   5.21   5.27   5.33   5.38   5.43   5.47
     .01    5.93   6.01   6.08   6.14   6.20   6.26   6.31   6.36   6.41
 40  .05    4.90   4.98   5.04   5.11   5.16   5.22   5.27   5.31   5.36
     .01    5.76   5.83   5.90   5.96   6.02   6.07   6.12   6.16   6.21
 60  .05    4.81   4.88   4.94   5.00   5.06   5.11   5.15   5.20   5.24
     .01    5.60   5.67   5.73   5.78   5.84   5.89   5.93   5.97   6.01
120  .05    4.71   4.78   4.84   4.90   4.95   5.00   5.04   5.09   5.13
     .01    5.44   5.50   5.56   5.61   5.66   5.71   5.75   5.79   5.83
  ∞  .05    4.62   4.68   4.74   4.80   4.85   4.89   4.93   4.97   5.01
     .01    5.29   5.35   5.40   5.45   5.49   5.54   5.57   5.61   5.65

From E. S. Pearson and H. O. Hartley, Biometrika Tables for Statisticians, Vol. I, 
Third Edition, © 1970 by Cambridge University Press, Cambridge. Abridged 
and adapted by permission (from Table 29). 




Table XI. Critical Values of Dunnett’s Test


This table gives the critical values of Dunnett’s test used in comparing
all treatment means to a control mean. The critical values are designated as
D[p, ν; 1 − α] corresponding to a given value of α, p as the number of treatment
groups excluding the control, and ν as the number of degrees of freedom
for the error. The critical values are given for one- and two-tailed tests at
α = 0.05, 0.01; p = 1 (1) 9; and ν = 5 (1) 20, 24, 30, 40, 60, 120, ∞. When the
researcher is comparing all treatment means to a control, the question often
is whether the treatment is better than the control. In this situation, one-tailed
critical values should be used. If the researcher wants to test whether the treatment
means are simply different from the control, in either direction, two-tailed
critical values are more appropriate.
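
The choice between one- and two-tailed critical values can change the conclusion. The sketch below illustrates this with the usual equal-sample-size form of Dunnett's procedure, in which a treatment mean is compared with the control mean against the critical difference D times the square root of 2·MSE/n. The critical values D[3, 20; 0.95] = 2.19 (one-tailed) and 2.54 (two-tailed) are read from the table; the error mean square, group size, and means are hypothetical, and the function name is illustrative.

```python
import math

def dunnett_critical_difference(d_crit, mse, n):
    """Critical difference between a treatment mean and the control (equal n)."""
    return d_crit * math.sqrt(2.0 * mse / n)

d_one_tailed = 2.19      # D[3, 20; 0.95], one-tailed, from the table
d_two_tailed = 2.54      # D[3, 20; 0.95], two-tailed, from the table
mse, n = 4.0, 6          # hypothetical: 3 treatments + control, 6 per group, 20 error df
control = 10.0           # hypothetical control mean
treatments = {"T1": 11.0, "T2": 12.7, "T3": 13.5}   # hypothetical treatment means

cd_one = dunnett_critical_difference(d_one_tailed, mse, n)
cd_two = dunnett_critical_difference(d_two_tailed, mse, n)

# One-tailed question: is the treatment better (larger) than the control?
better = [t for t, m in treatments.items() if m - control > cd_one]
# Two-tailed question: does the treatment differ from the control at all?
different = [t for t, m in treatments.items() if abs(m - control) > cd_two]
print(round(cd_one, 3), better, round(cd_two, 3), different)
```

In this illustration treatment T2 exceeds the one-tailed critical difference but not the two-tailed one, which is why the direction of the research question should be fixed before the table is consulted.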




One-Tailed Comparison
Number of Treatment Means, Excluding the Control (p)

  ν    α      1     2     3     4     5     6     7     8     9
  5  .05   2.02  2.44  2.68  2.85  2.98  3.08  3.16  3.24  3.30
     .01   3.37  3.90  4.21  4.43  4.60  4.73  4.85  4.94  5.03
  6  .05   1.94  2.34  2.56  2.71  2.83  2.92  3.00  3.07  3.12
     .01   3.14  3.61  3.88  4.07  4.21  4.33  4.43  4.51  4.59
  7  .05   1.89  2.27  2.48  2.62  2.73  2.82  2.89  2.95  3.01
     .01   3.00  3.42  3.66  3.83  3.96  4.07  4.15  4.23  4.30
  8  .05   1.86  2.22  2.42  2.55  2.66  2.74  2.81  2.87  2.92
     .01   2.90  3.29  3.51  3.67  3.79  3.88  3.96  4.03  4.09
  9  .05   1.83  2.18  2.37  2.50  2.60  2.68  2.75  2.81  2.86
     .01   2.82  3.19  3.40  3.55  3.66  3.75  3.82  3.89  3.94
 10  .05   1.81  2.15  2.34  2.47  2.56  2.64  2.70  2.76  2.81
     .01   2.76  3.11  3.31  3.45  3.56  3.64  3.71  3.78  3.83
 11  .05   1.80  2.13  2.31  2.44  2.53  2.60  2.67  2.72  2.77
     .01   2.72  3.06  3.25  3.38  3.48  3.56  3.63  3.69  3.74
 12  .05   1.78  2.11  2.29  2.41  2.50  2.58  2.64  2.69  2.74
     .01   2.68  3.01  3.19  3.32  3.42  3.50  3.56  3.62  3.67
 13  .05   1.77  2.09  2.27  2.39  2.48  2.55  2.61  2.66  2.71
     .01   2.65  2.97  3.15  3.27  3.37  3.44  3.51  3.56  3.61
 14  .05   1.76  2.08  2.25  2.37  2.46  2.53  2.59  2.64  2.69
     .01   2.62  2.94  3.11  3.23  3.32  3.40  3.46  3.51  3.56
 15  .05   1.75  2.07  2.24  2.36  2.44  2.51  2.57  2.62  2.67
     .01   2.60  2.91  3.08  3.20  3.29  3.36  3.42  3.47  3.52
 16  .05   1.75  2.06  2.23  2.34  2.43  2.50  2.56  2.61  2.65
     .01   2.58  2.88  3.05  3.17  3.26  3.33  3.39  3.44  3.48
 17  .05   1.74  2.05  2.22  2.33  2.42  2.49  2.54  2.59  2.64
     .01   2.57  2.86  3.03  3.14  3.23  3.30  3.36  3.41  3.45
 18  .05   1.73  2.05  2.21  2.32  2.41  2.48  2.53  2.58  2.62
     .01   2.55  2.84  3.01  3.12  3.21  3.27  3.33  3.38  3.42
 19  .05   1.73  2.03  2.20  2.31  2.40  2.47  2.52  2.57  2.61
     .01   2.54  2.83  2.99  3.10  3.18  3.25  3.31  3.36  3.40
 20  .05   1.72  2.03  2.19  2.30  2.39  2.46  2.51  2.56  2.60
     .01   2.53  2.81  2.97  3.08  3.17  3.23  3.29  3.34  3.38
 24  .05   1.71  2.01  2.17  2.28  2.36  2.43  2.48  2.53  2.57
     .01   2.49  2.77  2.92  3.03  3.11  3.17  3.22  3.27  3.31
 30  .05   1.70  1.99  2.15  2.25  2.33  2.40  2.45  2.50  2.54
     .01   2.46  2.72  2.87  2.97  3.05  3.11  3.16  3.21  3.24
 40  .05   1.68  1.97  2.13  2.23  2.31  2.37  2.42  2.47  2.51
     .01   2.42  2.68  2.82  2.92  2.99  3.05  3.10  3.14  3.18
 60  .05   1.67  1.95  2.10  2.21  2.28  2.35  2.39  2.44  2.48
     .01   2.39  2.64  2.78  2.87  2.94  3.00  3.04  3.08  3.12
120  .05   1.66  1.93  2.08  2.18  2.26  2.32  2.37  2.41  2.45
     .01   2.36  2.60  2.73  2.82  2.89  2.94  2.99  3.03  3.06
  ∞  .05   1.64  1.92  2.06  2.16  2.23  2.29  2.34  2.38  2.42
     .01   2.33  2.56  2.68  2.77  2.84  2.89  2.93  2.97  3.00


Table XI (continued)

Two-Tailed Comparison
Number of Treatment Means, Excluding the Control (p)

  ν    α      1     2     3     4     5     6     7     8     9
  5  .05   2.57  3.03  3.29  3.48  3.62  3.73  3.82  3.90  3.97
     .01   4.03  4.63  4.98  5.22  5.41  5.56  5.69  5.80  5.89
  6  .05   2.45  2.86  3.10  3.26  3.39  3.49  3.57  3.64  3.71
     .01   3.71  4.21  4.51  4.71  4.87  5.00  5.10  5.20  5.28
  7  .05   2.36  2.75  2.97  3.12  3.24  3.33  3.41  3.47  3.53
     .01   3.50  3.95  4.21  4.39  4.53  4.64  4.74  4.82  4.89
  8  .05   2.31  2.67  2.88  3.02  3.13  3.22  3.29  3.35  3.41
     .01   3.36  3.77  4.00  4.17  4.29  4.40  4.48  4.56  4.62
  9  .05   2.26  2.61  2.81  2.95  3.05  3.14  3.20  3.26  3.32
     .01   3.25  3.63  3.85  4.01  4.12  4.22  4.30  4.37  4.43
 10  .05   2.23  2.57  2.76  2.89  2.99  3.07  3.14  3.19  3.24
     .01   3.17  3.53  3.74  3.88  3.99  4.08  4.16  4.22  4.28
 11  .05   2.20  2.53  2.72  2.84  2.94  3.02  3.08  3.14  3.19
     .01   3.11  3.45  3.65  3.79  3.89  3.98  4.05  4.11  4.16
 12  .05   2.18  2.50  2.68  2.81  2.90  2.98  3.04  3.09  3.14
     .01   3.05  3.39  3.58  3.71  3.81  3.89  3.96  4.02  4.07
 13  .05   2.16  2.48  2.65  2.78  2.87  2.94  3.00  3.06  3.10
     .01   3.01  3.33  3.52  3.65  3.74  3.82  3.89  3.94  3.99
 14  .05   2.14  2.46  2.63  2.75  2.84  2.91  2.97  3.02  3.07
     .01   2.98  3.29  3.47  3.59  3.69  3.76  3.83  3.88  3.93
 15  .05   2.13  2.44  2.61  2.73  2.82  2.89  2.95  3.00  3.04
     .01   2.95  3.25  3.43  3.55  3.64  3.71  3.78  3.83  3.88
 16  .05   2.12  2.42  2.59  2.71  2.80  2.87  2.92  2.97  3.02
     .01   2.92  3.22  3.39  3.51  3.60  3.67  3.73  3.78  3.83
 17  .05   2.11  2.41  2.58  2.69  2.78  2.85  2.90  2.95  3.00
     .01   2.90  3.19  3.36  3.47  3.56  3.63  3.69  3.74  3.79
 18  .05   2.10  2.40  2.56  2.68  2.76  2.83  2.89  2.94  2.98
     .01   2.88  3.17  3.33  3.44  3.53  3.60  3.66  3.71  3.75
 19  .05   2.09  2.39  2.55  2.66  2.75  2.81  2.87  2.92  2.96
     .01   2.86  3.15  3.31  3.42  3.50  3.57  3.63  3.68  3.72
 20  .05   2.09  2.38  2.54  2.65  2.73  2.80  2.86  2.90  2.95
     .01   2.85  3.13  3.29  3.40  3.48  3.55  3.60  3.65  3.69
 24  .05   2.06  2.35  2.51  2.61  2.70  2.76  2.81  2.86  2.90
     .01   2.80  3.07  3.22  3.32  3.40  3.47  3.52  3.57  3.61
 30  .05   2.04  2.32  2.47  2.58  2.66  2.72  2.77  2.82  2.86
     .01   2.75  3.01  3.15  3.25  3.33  3.39  3.44  3.49  3.52
 40  .05   2.02  2.29  2.44  2.54  2.62  2.68  2.73  2.77  2.81
     .01   2.70  2.95  3.09  3.19  3.26  3.32  3.37  3.41  3.44
 60  .05   2.00  2.27  2.41  2.51  2.58  2.64  2.69  2.73  2.77
     .01   2.66  2.90  3.03  3.12  3.19  3.25  3.29  3.33  3.37
120  .05   1.98  2.24  2.38  2.47  2.55  2.60  2.65  2.69  2.73
     .01   2.62  2.85  2.97  3.06  3.12  3.18  3.22  3.26  3.29
  ∞  .05   1.96  2.21  2.35  2.44  2.51  2.57  2.61  2.65  2.69
     .01   2.58  2.79  2.92  3.00  3.06  3.11  3.15  3.19  3.22


From C. W. Dunnett, “A Multiple Comparison Procedure for Comparing Several
Treatments with a Control,” Journal of the American Statistical Association, 50
(1955), 1096-1121 and C. W. Dunnett, “New Tables for Multiple Comparisons
with a Control,” Biometrics, 20 (1964), 482-491. Reprinted by permission.




Table XII. Critical Values of Duncan’s Multiple Range Test


This table gives the critical values of Duncan’s multiple range test, which uses
protection level α for the collection of all tests. The critical values are designated
as R[r, ν; 1 − α] corresponding to a given level α, the number of means for
the range being tested or the number of steps apart of two means in an ordered
sequence (r), and the number of degrees of freedom for the error (ν). The critical
values are given for α = 0.05, 0.01; r = 2 (1) 10 (2) 20, 50, 100; and ν = 1 (1)
20 (2) 30, 40, 60, 100, ∞. For example, for α = 0.01, r = 3, and ν = 13, the
required critical value is obtained as R[3, 13; 0.99] = 4.48.
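
To see why the critical values change with r, it may help to compute the significance level at which a range of r means is effectively tested. The sketch below assumes the standard formulation of Duncan's procedure, in which a range spanning r ordered means is tested at level 1 − (1 − α)^(r − 1); this formula is stated here as an assumption rather than taken from the description above, and the function name is illustrative.

```python
def duncan_level(alpha, r):
    """Significance level applied to a range of r means in Duncan's procedure."""
    return 1.0 - (1.0 - alpha) ** (r - 1)

# Levels for adjacent means (r = 2), the example in the text (r = 3), and r = 10.
levels = {r: round(duncan_level(0.01, r), 4) for r in (2, 3, 10)}
print(levels)
```

For r = 2 the level equals α itself, and it grows as more means fall inside the range being tested.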




00'S 00'S 00°S 66'P L6v v6'p 06'P v8" I8'P LLb CL Y v9'P BSP OS'P Ley LIv 
Lye Lye Lye Lv'e Ort crt pre tre Cre Or'¢ BEE 9C'€ Ite Ste Olt IO'e 
LOS LOS LOS 90°¢ bO'S 00'S 96'P l6'P L8'p tsp 8LP OL TP o'r oop An4 It P 
Lyte Lye Lye Lyte Ort Ort ort rt Che Ive 6t'€ Lee tect Loe Blt tO'e 
a BS gis SI'S vis tls 80'S O'S 867 b6'p 88'P v8'P bly 69° CoP 8h P 90 
Lot LV Ee Lye Lyte Ort OV'¢ crt srt pyre cht Ip’ BEE Stet Ot’¢ Ite 90° 
9¢°S 9¢°¢ 97'S ves ccs LIS tls LOS cO'S 96'P c6P v8 OLY 89'P csT cev 
Bre 8r'¢ 8rt LY'e Ort Ort Ort Ort vye vee cht Ore OE e tee ect 80't 
6t'S 6c'¢ 6t S 8E'S pes 87'S ves SIs cls 90'S 10'S v6'v 98'P LLY coPV 6t P 
8r't 8rt Bre Lyte Ort Ort Ort Ort Ort 5 bre tre 6t'€ Set Lote IT'€ 
gos ccs ccs pss srs cys 9e°S 87'S ves 0¢c'S tis 90'S 96'1 88'P tly 8rP 
Bre Bre Bret Lyte Lyte Lye Lye Lye Lyte Lye Lyte Ort tre Let Otc cI't 
OLS OLS OLS OLS 09's OS's OS's Obs 9e°¢ ces Sos LIS 80'S 66'P 98° 09°P 
cSt (69 cSt cS'€ cSt cot cSt cSt (69 cSt cSt OS t Lyte Ive HEE Oct 
08'S 08'S 08'S 08'S OLS OL'S 09'S OS'S Iss Lys Ors ces tcs bis 00'S vl Pp 
9c't 9S't 9S't 9c'E 9C'f 9st OC'E Oct 9s'€ Oct 9S'€ Sa coe Lye 6t'€ 97'E 
00°9 00°9 00°9 00°9 06'S 06'S 08'S O8'S els 69'S [9°S sg crs Les cos S6Pv 
{9'¢ 19’ I9'€ 19° 19° 19't [9° 19°€ I9'€ I9'€ I9'¢€ 09'¢ Bot pst Lye cee 
0t°9 0t'9 0£9 0t'9 0¢'9 079 O19 00°9 00°9 S6¢ 88'S [8's tls go's [g's ves 
89'E 89'E 89't 89'E 89't 89t 89'E 89't 89't 89t 89't 89° 89° pO'e 8S t Ort 
08°9 089 08°9 OL9 OL'9 09°9 09°9 0s'9 vy9 Or'9 to9 909 819 119 96'S OLS 
ese tSe ese e8'°¢ eB ese tse ese e8'e tse tse ee cst 6L't pL’ p9O'' 
OSL OS'L OSL OSL Or'L Ov'L OL O€'L OCL OL OIL OL 00°L 06'9 089 1¢9 
cO'v COP COP cO'v cO'P cO'P cO'Y cO'P cO'P cO'v cO'P cO'P cO'v cO'P lO'P tot 
Ot '6 Ot 6 0£'6 Ot'6 0¢'6 O16 00°6 00°6 00°6 068 06°8 088 OL'8 09°8 0S'8 978 
OS'P OS" OS'P OS 'P OS’ OS’ OS 'P OS'P OS’ OS'P OS" OS 'P OS’ OS'P OS'P OS 'P 
00 FI 00'rI 00°V I 00'r I 00'r I 00'r! 00° I 00'rI 00°F I 00° I 00'TI 00'rI 00'r I 00'rI 00°F I 00°F I 
60°9 60°9 60°9 60°9 60°9 60°9 60°9 60°9 60°9 60°9 60°9 60°9 60°9 60°9 60°9 60°9 
00°06 00°06 00°06 00°06 00°06 00°06 00'06 00°06 00°06 00°06 00°06 00°06 00°06 00°06 00°06 00°06 
00°81 00°81 00°81 00°81 00°81 00°81 00°81 00°81 00°81 00°81 00°81 00°81 00°81 00°81 00°81 00°81 
00L 0s 02 BL 91 PL ch OL 6 8 Z 9 S v c ré 
(4) pajsay aduey 103 suvayy jo saquinN 


Sl 


vl 


tl 


cl 


| 


Ol 


The Analysis of Variance 


644 


‘uorsstuulod Aq payuliday ‘Zp—] “(SG6I) I] ‘sommoworg ,‘sisay 7 adninyjy pue osuey ojdnyny,, ‘ueoung ‘g ‘gq Wolly 


89'P 09°P Ip p 8E PV veV lt py 9b 7 LIP bly 60° b0'P 86¢ 06't O8't p9'€ 10° 

LOE 1I9'¢ Lyte pre Ipe BEE vee 67€ 97'€ ect 6I'€ ST¢e 60° cO'’ C67 LLZ@ SO oo 

ooh b9'P 8h P CrP cr P 8t'P cer 60 P 4 It p LI Il 90°F t6€ 98°C IL¢€ 10° 

tse toe Lyte srt cre Ore 9E't Cee 60 £ 9Ct COE Bie cle SOE S67 087 $0’ 00! 

99'P 99'P ESP OS’ Ly'P bey 6t'P vEP Ie py LOV tcv LID (an. tor COE OLE 10° 

8P't Bre Lyte sre tre Ort LA-t tee oe 87" VOL OCE pI’ 80° 867 t8'C SO’ 09 

69'b 69°P 6S'P LSD vS'P IS'p orb Ip vp LEV beD Ot PV vO LIv OI'P 66'¢ C8't 10° 

LY'C Lye Lyte Ort bre che 6t'te Ste tue Otte Loe CCE Lit Ole 1O'e 98°C SO’ Ov 

IL Pb IL'p o9o'P £9P IOP 8S'Pp bop SPP SS a Ip p 9D (6m 4 COP 9I'P 90'P 68°C [0° 

Lyte LVt Lyte Ort bre tre Ore Lee cet CEE 67'¢ STE Oct cle’ vO'e 68°C SO’ Ot 

cL YY CLY LO'P SOP c9o'b 09'r 9S'P IS'p LvP typ 6t 7 vey 8CP 8I'P 801 l6¢ 10° 

LVe LY’ Lye ort Spe tre Ort Lee Ste tet Ot'e 97'E Oct ele vO'e 06°C SO 8C 

tL p tLy 69°P Lov c9'P co'P 8S'P tsp OS'P OVP Iv'v i Oty IOP Ip toe 10° 

Lye Lye Lye Ort SPe tre Ive BEL 9C'€ vee Ore Loe Itt vie 90'€ 167 SO 97@ 

PLY bLY CL YY OLY Lob b9'P co'P LOY top 6b P br Pp 6t Py tev vO vIP 96'C 10° 

Lyte LY’ LV’ ort Sve pre Ip't BEE LEE bee lee 87 E CTE SI LOE C67 SO" VC 

SLY SLY SLY pLY ILD 89'P Sov 09° LS’ toy 8r'P cry 9C'P 87 P LI? 66'€ 10° 

Lye LY’ Lyte Ort cpt bre Che 6t't Eice See Cee 6C€ VTE LIE 80'¢ t67¢ SO’ (Zé 

6L'b 6L YP 6LY 8LP OLY tly 69'P cor IOP 8S'P tsp Lv'v OVP tov COP cO'Y 10° 

Lye LVe Lye Ort Ort bre tre Ort BEE 9E'€ pee Oe Sct 8It Ole S6C SO’ 0c 

C8 P C8'P C8 Vv ISP 6LP OLY cL Y LOY v9'P IOP 9S'P OS'P try to vO SOT 10° 

LVe Lye Lyte Lye Ort bre tre Ive 6tt bee See Ite 97 t 6l't Itt 967 SO’ 6l 

c8'P C8P S8 PV v8 YP C8'P 6L'b OLY IL Pb 89'P b9'P 6S P top oT 8C7 Lov LO'D 10’ 

LVe LVt LVt Lvt Ort crt tye Ip 6t'€ LEE Sct CEL LOE It € (ams L67 SO’ 81 

68'P 68'P 68 P 887 98'P tsp O8'P SLY CLY 89'P t9P 9S'P OS'P Iv'v OCP Oly 10° 

Lye Lv’ LVt Lye Ort Sve pre cre Ore BEE 9C'€ EDS BCE CCE tle 86°C SO’ LI 

vor v6'~ pov tov l6'P 887 8'P 6L OLY CLY LOY 09°P vo PV SVP ve PL tly 10° 

Lyte Lve Lyte Lye 9P't cre pre tre Ive 6et Lee be’ Oct CGE Sit 00° SO’ 91 

00L os 02 8h 91 vl ch OL 6 8 Z 9 S v € z 0 a 
(4) paysay asuey 410) sueap jo saquuinn 


(panuyjuo2) 1X a1qeL 


Statistical Tables and Charts 645 


Table XIII. Critical Values of the Bonferroni t Statistic and Dunn’s 
Multiple Comparison Test 


This table gives the critical values of the Bonferroni t statistic and Dunn’s
multiple comparison procedure. The critical values are given for α = 0.05, 0.01;
the number of comparisons p = 1 (1) 10 (5) 20 and the error degrees of freedom
ν = 2 (1) 30 (5) 60 (10) 120, 250, 500, 1000, ∞. For example, for α = 0.05,
p = 5, and ν = 10, the desired critical value is obtained as 3.1693.
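The entries can be reproduced as two-sided Student's t quantiles at per-comparison level α/p, that is, the upper α/(2p) point of t with ν degrees of freedom. The sketch below uses only the standard library and inverts a numerically integrated t distribution. The function names and the integration settings are illustrative assumptions, not taken from the table's source.

```python
import math

def t_pdf(x, nu):
    """Density of Student's t with nu degrees of freedom."""
    log_c = (math.lgamma(0.5 * (nu + 1)) - math.lgamma(0.5 * nu)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c - 0.5 * (nu + 1) * math.log1p(x * x / nu))

def t_cdf(x, nu, n=2000):
    """P(T <= x), valid for x >= 0 only, by Simpson's rule on [0, x] (n even)."""
    if x <= 0.0:
        return 0.5
    h = x / n
    total = t_pdf(0.0, nu) + t_pdf(x, nu)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * t_pdf(i * h, nu)
    return 0.5 + total * h / 3.0

def t_quantile(prob, nu):
    """Quantile for 0.5 < prob < 1, by bracketing and bisection."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, nu) < prob:
        lo, hi = hi, 2.0 * hi
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if t_cdf(mid, nu) < prob:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def bonferroni_t(alpha, p, nu):
    """Two-sided Bonferroni critical value for p comparisons."""
    return t_quantile(1.0 - alpha / (2.0 * p), nu)

if __name__ == "__main__":
    # Worked example from the text: alpha = 0.05, p = 5, nu = 10.
    print(round(bonferroni_t(0.05, 5, 10), 4))
```

For the worked example the per-comparison level is 0.05/5 = 0.01, so the value sought is the 0.995 quantile of t with 10 degrees of freedom.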


α_Bon = 0.05, α_ind = 0.05/p

                                        Number of comparisons (p)
       ν        1        2        3        4        5        6        7        8        9       10       15       20
100(α/p)   5.0000   2.5000   1.6667   1.2500   1.0000   0.8333   0.7143   0.6250   0.5556   0.5000   0.3333   0.2500

       2   4.3027   6.2053   7.6488   8.8602   9.9248  10.8859  11.7687  12.5897  13.3604  14.0890  17.2772  19.9625
       3   3.1824   4.1765   4.8567   5.3919   5.8409   6.2315   6.5797   6.8952   7.1849   7.4533   8.5752   9.4649
       4   2.7764   3.4954   3.9608   4.3147   4.6041   4.8510   5.0675   5.2611   5.4366   5.5976   6.2541   6.7583
       5   2.5706   3.1634   3.5341   3.8100   4.0321   4.2193   4.3818   4.5257   4.6553   4.7733   5.2474   5.6042

       6   2.4469   2.9687   3.2875   3.5212   3.7074   3.8630   3.9971   4.1152   4.2209   4.3168   4.6979   4.9807
       7   2.3646   2.8412   3.1276   3.3353   3.4995   3.6358   3.7527   3.8552   3.9467   4.0293   4.3553   4.5946
       8   2.3060   2.7515   3.0158   3.2060   3.3554   3.4789   3.5844   3.6766   3.7586   3.8325   4.1224   4.3335
       9   2.2622   2.6850   2.9333   3.1109   3.2498   3.3642   3.4616   3.5465   3.6219   3.6897   3.9542   4.1458
      10   2.2281   2.6338   2.8701   3.0382   3.1693   3.2768   3.3682   3.4477   3.5182   3.5814   3.8273   4.0045

      11   2.2010   2.5931   2.8200   2.9809   3.1058   3.2081   3.2949   3.3702   3.4368   3.4966   3.7283   3.8945
      12   2.1788   2.5600   2.7795   2.9345   3.0545   3.1527   3.2357   3.3078   3.3714   3.4284   3.6489   3.8065
      13   2.1604   2.5326   2.7459   2.8961   3.0123   3.1070   3.1871   3.2565   3.3177   3.3725   3.5838   3.7345
      14   2.1448   2.5096   2.7178   2.8640   2.9768   3.0688   3.1464   3.2135   3.2727   3.3257   3.5296   3.6746
      15   2.1314   2.4899   2.6937   2.8366   2.9467   3.0363   3.1118   3.1771   3.2346   3.2860   3.4837   3.6239

      16   2.1199   2.4729   2.6730   2.8131   2.9208   3.0083   3.0821   3.1458   3.2019   3.2520   3.4443   3.5805
      17   2.1098   2.4581   2.6550   2.7925   2.8982   2.9840   3.0563   3.1186   3.1735   3.2224   3.4102   3.5429
      18   2.1009   2.4450   2.6391   2.7745   2.8784   2.9627   3.0336   3.0948   3.1486   3.1966   3.3804   3.5101
      19   2.0930   2.4334   2.6251   2.7586   2.8609   2.9439   3.0136   3.0738   3.1266   3.1737   3.3540   3.4812
      20   2.0860   2.4231   2.6126   2.7444   2.8453   2.9271   2.9958   3.0550   3.1070   3.1534   3.3306   3.4554

      21   2.0796   2.4138   2.6013   2.7316   2.8314   2.9121   2.9799   3.0382   3.0895   3.1352   3.3097   3.4325
      22   2.0739   2.4055   2.5912   2.7201   2.8188   2.8985   2.9655   3.0231   3.0737   3.1188   3.2909   3.4118
      23   2.0687   2.3979   2.5820   2.7079   2.8073   2.8863   2.9525   3.0095   3.0595   3.1040   3.2739   3.3931
      24   2.0639   2.3909   2.5736   2.7002   2.7969   2.8751   2.9406   2.9970   3.0465   3.0905   3.2584   3.3761
      25   2.0595   2.3846   2.5660   2.6916   2.7874   2.8649   2.9298   2.9856   3.0346   3.0782   3.2443   3.3606

      26   2.0555   2.3788   2.5589   2.6836   2.7787   2.8555   2.9199   2.9752   3.0237   3.0669   3.2313   3.3464
      27   2.0518   2.3734   2.5525   2.6763   2.7707   2.8469   2.9107   2.9656   3.0137   3.0565   3.2194   3.3334
      28   2.0484   2.3685   2.5465   2.6695   2.7633   2.8389   2.9023   2.9567   3.0045   3.0469   3.2084   3.3214
      29   2.0452   2.3638   2.5409   2.6632   2.7564   2.8316   2.8945   2.9485   2.9959   3.0380   3.1982   3.3102
      30   2.0423   2.3596   2.5357   2.6574   2.7500   2.8247   2.8872   2.9409   2.9880   3.0298   3.1888   3.2999

      35   2.0301   2.3420   2.5145   2.6334   2.7238   2.7966   2.8575   2.9097   2.9554   2.9960   3.1502   3.2577
      40   2.0211   2.3289   2.4989   2.6157   2.7045   2.7759   2.8355   2.8867   2.9314   2.9712   3.1218   3.2266
      45   2.0141   2.3189   2.4868   2.6021   2.6896   2.7599   2.8187   2.8690   2.9130   2.9521   3.1000   3.2028
      50   2.0086   2.3109   2.4772   2.5913   2.6778   2.7473   2.8053   2.8550   2.8984   2.9370   3.0828   3.1840
      55   2.0040   2.3044   2.4694   2.5825   2.6682   2.7370   2.7944   2.8436   2.8866   2.9247   3.0688   3.1688

      60   2.0003   2.2990   2.4630   2.5752   2.6603   2.7286   2.7855   2.8342   2.8768   2.9146   3.0573   3.1562
      70   1.9944   2.2906   2.4529   2.5639   2.6479   2.7153   2.7715   2.8195   2.8615   2.8987   3.0393   3.1366
      80   1.9901   2.2844   2.4454   2.5554   2.6387   2.7054   2.7610   2.8086   2.8502   2.8870   3.0259   3.1220
      90   1.9867   2.2795   2.4395   2.5489   2.6316   2.6978   2.7530   2.8002   2.8414   2.8779   3.0156   3.1108
     100   1.9840   2.2757   2.4349   2.5437   2.6259   2.6918   2.7466   2.7935   2.8344   2.8707   3.0073   3.1018

     110   1.9818   2.2725   2.4311   2.5394   2.6213   2.6868   2.7414   2.7880   2.8287   2.8648   3.0007   3.0945
     120   1.9799   2.2699   2.4280   2.5359   2.6174   2.6827   2.7370   2.7835   2.8240   2.8599   2.9951   3.0885
     250   1.9695   2.2550   2.4102   2.5159   2.5956   2.6594   2.7124   2.7577   2.7972   2.8322   2.9637   3.0543
     500   1.9647   2.2482   2.4021   2.5068   2.5857   2.6488   2.7012   2.7460   2.7850   2.8195   2.9494   3.0387
    1000   1.9623   2.2448   2.3980   2.5022   2.5808   2.6435   2.6957   2.7402   2.7790   2.8133   2.9423   3.0310
       ∞   1.9600   2.2414   2.3940   2.4977   2.5758   2.6383   2.6901   2.7344   2.7729   2.8070   2.9352   3.0233


Table XIII (continued)

α_Bon = 0.01, α_ind = 0.01/p

                                        Number of comparisons (p)
       ν        1        2        3        4        5        6        7        8        9       10       15       20
100(α/p)   1.0000   0.5000   0.3333   0.2500   0.2000   0.1667   0.1429   0.1250   0.1111   0.1000   0.0667   0.0500

       2   9.9248  14.0890  17.2772  19.9625  22.3271  24.4643  26.4292  28.2577  29.9750  31.5991  38.7105  44.7046
       3   5.8409   7.4533   8.5752   9.4649  10.2145  10.8668  11.4532  11.9838  12.4715  12.9240  14.8194  16.3263
       4   4.6041   5.5976   6.2541   6.7583   7.1732   7.5287   7.8414   8.1216   8.3763   8.6103   9.5679  10.3063
       5   4.0321   4.7733   5.2474   5.6042   5.8934   6.1384   6.3518   6.5414   6.7126   6.8688   7.4990   7.9757

       6   3.7074   4.3168   4.6979   4.9807   5.2076   5.3982   5.5632   5.7090   5.8399   5.9588   6.4338   6.7883
       7   3.4995   4.0293   4.3553   4.5946   4.7853   4.9445   5.0815   5.2022   5.3101   5.4079   5.7954   6.0818
       8   3.3554   3.8325   4.1224   4.3335   4.5008   4.6398   4.7590   4.8636   4.9570   5.0413   5.3737   5.6174
       9   3.2498   3.6897   3.9542   4.1458   4.2968   4.4219   4.5288   4.6224   4.7058   4.7809   5.0757   5.2907
      10   3.1693   3.5814   3.8273   4.0045   4.1437   4.2586   4.3567   4.4423   4.5184   4.5869   4.8547   5.0490

      11   3.1058   3.4966   3.7283   3.8945   4.0247   4.1319   4.2232   4.3028   4.3735   4.4370   4.6845   4.8633
      12   3.0545   3.4284   3.6489   3.8065   3.9296   4.0308   4.1169   4.1918   4.2582   4.3178   4.5496   4.7165
      13   3.0123   3.3725   3.5838   3.7345   3.8520   3.9484   4.0302   4.1013   4.1643   4.2208   4.4401   4.5975
      14   2.9768   3.3257   3.5296   3.6746   3.7874   3.8798   3.9582   4.0263   4.0865   4.1405   4.3495   4.4992
      15   2.9467   3.2860   3.4837   3.6239   3.7328   3.8220   3.8975   3.9630   4.0209   4.0728   4.2733   4.4166

      16   2.9208   3.2520   3.4443   3.5805   3.6862   3.7725   3.8456   3.9089   3.9649   4.0150   4.2084   4.3463
      17   2.8982   3.2224   3.4102   3.5429   3.6458   3.7297   3.8007   3.8623   3.9165   3.9651   4.1525   4.2858
      18   2.8784   3.1966   3.3804   3.5101   3.6105   3.6924   3.7616   3.8215   3.8744   3.9216   4.1037   4.2332
      19   2.8609   3.1737   3.3540   3.4812   3.5794   3.6595   3.7271   3.7857   3.8373   3.8834   4.0609   4.1869
      20   2.8453   3.1534   3.3306   3.4554   3.5518   3.6303   3.6966   3.7539   3.8044   3.8495   4.0230   4.1460

      21   2.8314   3.1352   3.3097   3.4325   3.5272   3.6043   3.6693   3.7255   3.7750   3.8193   3.9892   4.1096
      22   2.8188   3.1188   3.2909   3.4118   3.5050   3.5808   3.6448   3.7000   3.7487   3.7921   3.9589   4.0769
      23   2.8073   3.1040   3.2739   3.3931   3.4850   3.5597   3.6226   3.6770   3.7249   3.7676   3.9316   4.0474
      24   2.7969   3.0905   3.2584   3.3761   3.4668   3.5405   3.6025   3.6561   3.7033   3.7454   3.9068   4.0207
      25   2.7874   3.0782   3.2443   3.3606   3.4502   3.5230   3.5842   3.6371   3.6836   3.7251   3.8842   3.9964

      26   2.7787   3.0669   3.2313   3.3464   3.4350   3.5069   3.5674   3.6197   3.6656   3.7066   3.8635   3.9742
      27   2.7707   3.0565   3.2194   3.3334   3.4210   3.4922   3.5520   3.6037   3.6491   3.6896   3.8446   3.9538
      28   2.7633   3.0469   3.2084   3.3214   3.4082   3.4786   3.5378   3.5889   3.6338   3.6739   3.8271   3.9351
      29   2.7564   3.0380   3.1982   3.3102   3.3962   3.4660   3.5247   3.5753   3.6198   3.6594   3.8110   3.9177
      30   2.7500   3.0298   3.1888   3.2999   3.3852   3.4544   3.5125   3.5626   3.6067   3.6460   3.7961   3.9016

      35   2.7238   2.9960   3.1502   3.2577   3.3400   3.4068   3.4628   3.5110   3.5534   3.5911   3.7352   3.8362
      40   2.7045   2.9712   3.1218   3.2266   3.3069   3.3718   3.4263   3.4732   3.5143   3.5510   3.6906   3.7884
      45   2.6896   2.9521   3.1000   3.2028   3.2815   3.3451   3.3984   3.4442   3.4845   3.5203   3.6565   3.7519
      50   2.6778   2.9370   3.0828   3.1840   3.2614   3.3239   3.3763   3.4214   3.4609   3.4960   3.6297   3.7231
      55   2.6682   2.9247   3.0688   3.1688   3.2451   3.3068   3.3585   3.4029   3.4418   3.4764   3.6080   3.6999

      60   2.6603   2.9146   3.0573   3.1562   3.2317   3.2927   3.3437   3.3876   3.4260   3.4602   3.5901   3.6807
      70   2.6479   2.8987   3.0393   3.1366   3.2108   3.2707   3.3208   3.3638   3.4015   3.4350   3.5622   3.6509
      80   2.6387   2.8870   3.0259   3.1220   3.1953   3.2543   3.3037   3.3462   3.3833   3.4163   3.5416   3.6288
      90   2.6316   2.8779   3.0156   3.1108   3.1833   3.2417   3.2906   3.3326   3.3693   3.4019   3.5257   3.6118
     100   2.6259   2.8707   3.0073   3.1018   3.1737   3.2317   3.2802   3.3218   3.3582   3.3905   3.5131   3.5983

     110   2.6213   2.8648   3.0007   3.0945   3.1660   3.2235   3.2717   3.3130   3.3491   3.3812   3.5028   3.5874
     120   2.6174   2.8599   2.9951   3.0885   3.1595   3.2168   3.2646   3.3057   3.3416   3.3735   3.4943   3.5783
     250   2.5956   2.8322   2.9637   3.0543   3.1232   3.1785   3.2248   3.2644   3.2991   3.3299   3.4462   3.5270
     500   2.5857   2.8195   2.9494   3.0387   3.1066   3.1612   3.2067   3.2457   3.2798   3.3101   3.4245   3.5037
    1000   2.5808   2.8133   2.9423   3.0310   3.0984   3.1526   3.1977   3.2365   3.2703   3.3003   3.4137   3.4922
       ∞   2.5758   2.8070   2.9352   3.0233   3.0902   3.1440   3.1888   3.2272   3.2608   3.2905   3.4029   3.4808


From B. J. R. Bailey, “Tables of the Bonferroni t Statistic,” Journal of the American
Statistical Association, 72 (1977), 469-478. Abridged and reprinted by permission.




Table XIV. Critical Values of the Dunn-Sidak’s Multiple 
Comparison Test 


This table gives the critical values of the Dunn-Sidak’s multiple comparison
procedure. The critical values are given for α = 0.01, 0.05, 0.10, 0.20; the number
of comparisons p = 2 (1) 10 (5) 40 (10) 50; and the error degrees of freedom
ν = 2 (1) 30, 40, 60, 120, ∞. For example, for α = 0.05, p = 3, and ν = 12, the
required critical value is obtained as 2.770.
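The difference from the Bonferroni values of Table XIII lies in the per-comparison level. The sketch below assumes the usual definition, in which the Dunn-Sidak critical value is the two-sided t quantile at per-comparison level 1 − (1 − α)^(1/p) instead of α/p. The function names are illustrative.

```python
def bonferroni_level(alpha, p):
    """Per-comparison level used by the Bonferroni (Dunn) procedure."""
    return alpha / p

def sidak_level(alpha, p):
    """Per-comparison level assumed for the Dunn-Sidak procedure."""
    return 1.0 - (1.0 - alpha) ** (1.0 / p)

def familywise_level(per_comparison, p):
    """Familywise error rate if the p comparisons were independent."""
    return 1.0 - (1.0 - per_comparison) ** p

if __name__ == "__main__":
    alpha, p = 0.05, 3  # worked example from the text
    for name, level in (("Bonferroni", bonferroni_level(alpha, p)),
                        ("Sidak", sidak_level(alpha, p))):
        print(name, level, familywise_level(level, p))
```

For α = 0.05 and p = 3 the Sidak level is about 0.01695, slightly above the Bonferroni level of about 0.01667. This is consistent with the tabulated 2.770 for ν = 12 being slightly smaller than the corresponding Table XIII entry, 2.7795.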


[Table XIV: critical values of the Dunn-Sidak’s multiple comparison procedure, with rows indexed by the error degrees of freedom (ν) and level (α = 0.01, 0.05, 0.10, 0.20) and columns by the number of comparisons (p = 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 50). The tabulated entries are not legible in this copy.]

From P. A. Games, “An Improved t Table for Simultaneous Control on g Contrasts,” Journal of the American Statistical Association, 72 (1977), 531-534. Reprinted by permission.




Table XV. Critical Values of the Studentized Maximum Modulus 
Distribution 


This table gives the critical values of the Studentized maximum modulus
distribution used in multiple comparisons. The critical values are given for
α = 0.10, 0.05, 0.01; the number of comparisons p = 2 (1) 15; and the error
degrees of freedom ν = 2 (1) 12 (2) 20, 24, 30, 40, 60, ∞. For example, for
α = 0.05, p = 3, and ν = 12, the required critical value is obtained as 2.75.
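As a rough check on how an entry arises, the sketch below evaluates the distribution function numerically. It assumes the usual definition of the Studentized maximum modulus: the maximum of p independent standard normal absolute values divided by an independent scale estimate s, where ν s² has a chi-square distribution with ν degrees of freedom. The function names, integration cutoff, and grid size are illustrative choices.

```python
import math

def scale_pdf(s, nu):
    """Density of s = sqrt(chi-square(nu) / nu)."""
    if s <= 0.0:
        return math.sqrt(2.0 / math.pi) if nu == 1 else 0.0
    log_f = (math.log(2.0) + 0.5 * nu * math.log(0.5 * nu)
             - math.lgamma(0.5 * nu)
             + (nu - 1) * math.log(s) - 0.5 * nu * s * s)
    return math.exp(log_f)

def smm_cdf(m, p, nu, n=400):
    """P(max_i |Z_i| / s <= m) for p independent N(0, 1) variables Z_i."""
    s_max = 1.0 + 10.0 / math.sqrt(nu)  # assumed cutoff; the density is negligible beyond it

    def integrand(s):
        # Given s, each |Z_i| <= m * s has probability erf(m * s / sqrt(2)).
        return scale_pdf(s, nu) * math.erf(m * s / math.sqrt(2.0)) ** p

    # Composite Simpson rule on [0, s_max]; n must be even.
    h = s_max / n
    total = integrand(0.0) + integrand(s_max)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0

if __name__ == "__main__":
    # Worked example from the text: alpha = 0.05, p = 3, nu = 12, value 2.75.
    print(smm_cdf(2.75, 3, 12))
```

Under these assumptions the printed probability should be close to 1 − α = 0.95. With p = 1 the statistic reduces to |t|, so `smm_cdf(2.1788, 1, 12)` should also be close to 0.95, in line with the ν = 12 entry in the first column of Table XIII.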


                                       Number of comparisons (p)
  ν  α         2      3      4      5      6      7      8      9     10     11     12     13     14     15

  2  0.10   3.83   4.38   4.77   5.06   5.30   5.50   5.67   5.82   5.96   6.08   6.18   6.28   6.37   6.45
     0.05   5.57   6.34   6.89   7.31   7.65   7.93   8.17   8.38   8.57   8.74   8.89   9.03   9.16   9.28
     0.01  12.73  14.44  15.65  16.59  17.35  17.99  18.53  19.01  19.43  19.81  20.15  20.46  20.75  21.02

  3  0.10   2.99   3.37   3.64   3.84   4.01   4.15   4.27   4.38   4.47   4.55   4.63   4.70   4.76   4.82
     0.05   3.96   4.43   4.76   5.02   5.23   5.41   5.56   5.69   5.81   5.92   6.01   6.10   6.18   6.26
     0.01   7.13   7.91   8.48   8.92   9.28   9.58   9.84  10.06  10.27  10.45  10.61  10.76  10.90  11.03

  4  0.10   2.66   2.98   3.20   3.37   3.51   3.62   3.72   3.81   3.89   3.96   4.02   4.08   4.13   4.18
     0.05   3.38   3.74   4.00   4.20   4.37   4.50   4.62   4.72   4.82   4.90   4.98   5.04   5.11   5.17
     0.01   5.46   5.99   6.36   6.66   6.90   7.10   7.27   7.43   7.57   7.69   7.80   7.91   8.00   8.09

  5  0.10   2.49   2.77   2.96   3.12   3.24   3.34   3.43   3.51   3.58   3.64   3.69   3.75   3.79   3.84
     0.05   3.09   3.40   3.62   3.79   3.93   4.04   4.14   4.23   4.31   4.38   4.45   4.51   4.56   4.61
     0.01   4.70   5.11   5.40   5.63   5.81   5.97   6.11   6.23   6.33   6.43   6.52   6.60   6.67   6.74

  6  0.10   2.39   2.64   2.82   2.96   3.07   3.17   3.25   3.32   3.38   3.44   3.49   3.54   3.58   3.62
     0.05   2.92   3.19   3.39   3.54   3.66   3.77   3.86   3.94   4.01   4.07   4.13   4.18   4.23   4.28
     0.01   4.27   4.61   4.86   5.05   5.20   5.33   5.45   5.55   5.64   5.72   5.80   5.86   5.93   5.99

  7  0.10   2.31   2.56   2.73   2.86   2.96   3.05   3.13   3.19   3.25   3.31   3.35   3.40   3.44   3.48
     0.05   2.80   3.06   3.24   3.38   3.49   3.59   3.67   3.74   3.80   3.86   3.92   3.96   4.01   4.05
     0.01   4.00   4.30   4.51   4.68   4.81   4.93   5.03   5.12   5.20   5.27   5.33   5.39   5.45   5.50

  8  0.10   2.26   2.49   2.66   2.78   2.88   2.97   3.04   3.10   3.16   3.21   3.26   3.30   3.34   3.37
     0.05   2.72   2.96   3.13   3.26   3.36   3.45   3.53   3.60   3.66   3.71   3.76   3.81   3.85   3.89
     0.01   3.81   4.08   4.27   4.42   4.55   4.65   4.74   4.82   4.89   4.96   5.02   5.07   5.12   5.17

  9  0.10   2.22   2.45   2.60   2.72   2.82   2.90   2.97   3.03   3.09   3.13   3.18   3.22   3.26   3.29
     0.05   2.66   2.89   3.05   3.17   3.27   3.36   3.43   3.49   3.55   3.60   3.65   3.69   3.73   3.77
     0.01   3.67   3.92   4.10   4.24   4.35   4.45   4.53   4.61   4.67   4.73   4.79   4.84   4.88   4.92

 10  0.10   2.19   2.41   2.56   2.68   2.77   2.85   2.92   2.98   3.03   3.08   3.12   3.16   3.20   3.23
     0.05   2.61   2.83   2.98   3.10   3.20   3.28   3.35   3.41   3.47   3.52   3.56   3.60   3.64   3.68
     0.01   3.57   3.80   3.97   4.10   4.20   4.29   4.37   4.44   4.50   4.56   4.61   4.66   4.70   4.74

 11  0.10   2.17   2.38   2.53   2.64   2.73   2.81   2.88   2.93   2.98   3.03   3.07   3.11   3.15   3.18
     0.05   2.57   2.78   2.93   3.05   3.14   3.22   3.29   3.35   3.40   3.45   3.49   3.53   3.57   3.60
     0.01   3.48   3.71   3.87   3.99   4.09   4.17   4.25   4.31   4.37   4.42   4.47   4.51   4.55   4.59

 12  0.10   2.15   2.36   2.50   2.61   2.70   2.78   2.84   2.90   2.95   2.99   3.03   3.07   3.10   3.14
     0.05   2.54   2.75   2.89   3.00   3.09   3.17   3.24   3.29   3.34   3.39   3.43   3.47   3.51   3.54
     0.01   3.42   3.63   3.78   3.90   4.00   4.08   4.15   4.21   4.26   4.31   4.36   4.40   4.44   4.48

 14  0.10   2.12   2.32   2.46   2.57   2.65   2.72   2.79   2.84   2.89   2.93   2.97   3.01   3.04   3.07
     0.05   2.49   2.69   2.83   2.94   3.02   3.09   3.16   3.21   3.26   3.30   3.34   3.38   3.41   3.45
     0.01   3.32   3.52   3.66   3.77   3.85   3.93   3.99   4.05   4.10   4.15   4.19   4.23   4.26   4.30

 16  0.10   2.10   2.29   2.43   2.53   2.62   2.69   2.75   2.80   2.85   2.89   2.93   2.96   2.99   3.02
     0.05   2.46   2.65   2.78   2.89   2.97   3.04   3.10   3.15   3.20   3.24   3.28   3.31   3.35   3.38
     0.01   3.25   3.43   3.57   3.67   3.75   3.82   3.88   3.94   3.99   4.03   4.07   4.11   4.14   4.17




Table XV (continued) 


                                     Number of comparisons (p)
  ν  α          2      3      4      5      6      7      8      9     10     11     12     13     14     15

 18  0.10    2.08   2.27   2.41   2.51   2.59   2.66   2.72   2.77   2.81   2.85   2.89   2.92   2.96   2.99
     0.05    2.43   2.62   2.75   2.85   2.93   3.00   3.05   3.11   3.15   3.19   3.23   3.26   3.29   3.32
     0.01    3.19   3.37   3.50   3.60   3.68   3.74   3.80   3.85   3.90   3.94   3.98   4.01   4.04   4.07

 20  0.10    2.07   2.26   2.39   2.49   2.57   2.63   2.69   2.74   2.79   2.83   2.86   2.90   2.93   2.96
     0.05    2.41   2.59   2.72   2.82   2.90   2.96   3.02   3.07   3.11   3.15   3.19   3.22   3.25   3.28
     0.01    3.15   3.32   3.45   3.54   3.62   3.68   3.74   3.79   3.83   3.87   3.91   3.94   3.97   4.00

 24  0.10    2.05   2.23   2.36   2.46   2.53   2.60   2.66   2.70   2.75   2.79   2.82   2.85   2.88   2.91
     0.05    2.38   2.56   2.68   2.77   2.85   2.91   2.97   3.02   3.06   3.10   3.13   3.16   3.19   3.22
     0.01    3.09   3.25   3.37   3.46   3.53   3.59   3.64   3.69   3.73   3.77   3.80   3.83   3.86   3.89

 30  0.10    2.03   2.21   2.33   2.43   2.50   2.57   2.62   2.67   2.71   2.75   2.78   2.81   2.84   2.87
     0.05    2.35   2.52   2.64   2.73   2.80   2.87   2.92   2.96   3.00   3.04   3.07   3.11   3.13   3.16
     0.01    3.03   3.18   3.29   3.38   3.45   3.51   3.55   3.60   3.64   3.67   3.70   3.73   3.76   3.78

 40  0.10    2.01   2.18   2.30   2.40   2.47   2.53   2.58   2.63   2.67   2.71   2.74   2.77   2.80   2.82
     0.05    2.32   2.49   2.60   2.69   2.76   2.82   2.87   2.91   2.95   2.99   3.02   3.05   3.08   3.10
     0.01    2.97   3.12   3.22   3.30   3.37   3.42   3.47   3.51   3.54   3.58   3.61   3.63   3.66   3.68

 60  0.10    1.99   2.16   2.28   2.37   2.44   2.50   2.55   2.59   2.63   2.67   2.70   2.73   2.76   2.78
     0.05    2.29   2.45   2.56   2.65   2.72   2.77   2.82   2.86   2.90   2.93   2.96   2.99   3.02   3.04
     0.01    2.91   3.05   3.15   3.23   3.29   3.34   3.38   3.42   3.46   3.49   3.51   3.54   3.56   3.59

  ∞  0.10    1.95   2.11   2.23   2.31   2.38   2.43   2.48   2.52   2.56   2.59   2.62   2.65   2.67   2.70
     0.05    2.24   2.39   2.49   2.57   2.63   2.68   2.73   2.77   2.80   2.83   2.86   2.88   2.91   2.93
     0.01    2.81   2.93   3.02   3.09   3.14   3.19   3.23   3.26   3.29   3.32   3.34   3.36   3.38   3.40


From R. E. Bechhofer and C. W. Dunnett, “Comparisons for Orthogonal
Contrasts: Examples and Tables,” Technometrics, 24 (1982), 213-222. Abridged
and reprinted by permission.




Table XVI. Critical Values of the Studentized Augmented 
Range Distribution 


This table gives the critical values of the Studentized augmented range
distribution used in multiple comparisons. The critical values are given for
α = 0.20, 0.10, 0.05, 0.01; the number of comparisons p = 2 (1) 8; and the error
degrees of freedom ν = 5, 7, 10, 12 (4) 24, 30, 40, 60, 120, ∞. For example,
for α = 0.05, p = 4, and ν = 16, the desired critical value is obtained as 4.050.


                        Number of comparisons (p)
   ν   α         2       3       4       5       6       7       8
   5  .20    2.326   2.935   3.379   3.719   3.991   4.215   4.406
      .10    3.060   3.772   4.282   4.671   4.982   5.239   5.458
      .05    3.832   4.654   5.236   5.680   6.036   6.331   6.583
      .01    5.903   7.030   7.823   8.429   8.916   9.322   9.669
   7  .20    2.213   2.783   3.195   3.508   3.757   3.963   4.137
      .10    2.848   3.491   3.943   4.285   4.556   4.781   4.972
      .05    3.486   4.198   4.692   5.064   5.360   5.606   5.816
      .01    5.063   5.947   6.551   7.008   7.374   7.679   7.939
  10  .20    2.133   2.676   3.066   3.359   3.592   3.783   3.944
      .10    2.704   3.300   3.712   4.021   4.265   4.466   4.636
      .05    3.259   3.899   4.333   4.656   4.913   5.124   5.305
      .01    4.550   5.284   5.773   6.138   6.428   6.669   6.875
  12  .20    2.103   2.636   3.017   3.303   3.530   3.715   3.872
      .10    2.651   3.230   3.628   3.924   4.157   4.349   4.511
      .05    3.177   3.791   4.204   4.509   4.751   4.950   5.119
      .01    4.373   5.056   5.505   5.837   6.101   6.321   6.507
  16  .20    2.066   2.587   2.958   3.235   3.453   3.632   3.782
      .10    2.588   3.146   3.526   3.806   4.027   4.207   4.360
      .05    3.080   3.663   4.050   4.334   4.557   4.741   4.897
      .01    4.169   4.792   5.194   5.489   5.722   5.915   6.079
  20  .20    2.045   2.558   2.923   3.195   3.408   3.582   3.729
      .10    2.551   3.097   3.466   3.738   3.950   4.124   4.271
      .05    3.024   3.590   3.961   4.233   4.446   4.620   4.768
      .01    4.055   4.644   5.019   5.294   5.510   5.688   5.839
  24  .20    2.031   2.539   2.900   3.168   3.378   3.549   3.694
      .10    2.527   3.065   3.427   3.693   3.901   4.070   4.213
      .05    2.988   3.542   3.904   4.167   4.373   4.541   4.684
      .01    3.982   4.549   4.908   5.169   5.374   5.542   5.685
  30  .20    2.017   2.521   2.877   3.142   3.348   3.517   3.659
      .10    2.503   3.034   3.389   3.649   3.851   4.016   4.155
      .05    2.952   3.496   3.847   4.103   4.320   4.464   4.602
      .01    3.912   4.458   4.800   5.048   5.242   5.401   5.536
  40  .20    2.003   2.502   2.855   3.116   3.319   3.485   3.624
      .10    2.480   3.003   3.352   3.605   3.803   3.963   4.099
      .05    2.918   3.450   3.792   4.040   4.232   4.389   4.521
      .01    3.844   4.370   4.696   4.931   5.115   5.265   5.392
  60  .20    1.990   2.484   2.833   3.090   3.290   3.453   3.589
      .10    2.457   2.927   3.315   3.563   3.755   3.911   4.042
      .05    2.884   3.406   3.738   3.978   4.163   4.314   4.441
      .01    3.778   4.284   4.595   4.818   4.991   5.133   5.253
 120  .20    1.976   2.466   2.811   3.064   3.261   3.421   3.554
      .10    2.434   2.943   3.278   3.520   3.707   3.859   3.987
      .05    2.851   3.362   3.686   3.917   4.096   4.241   4.363
      .01    3.714   4.201   4.497   4.709   4.872   5.005   5.118
   ∞  .20    1.963   2.448   2.789   3.039   3.232   3.389   3.520
      .10    2.412   2.913   3.243   3.479   3.661   3.808   3.931
      .05    2.819   3.320   3.634   3.858   4.030   4.170   4.286
      .01    3.653   4.121   4.403   4.603   4.757   4.882   4.987


From M. R. Stoline, “Tables of the Studentized Augmented Range and Applica- 
tions to Problems of Multiple Comparisons,” Journal of the American Statistical 
Association, 73 (1978), 656-660. Adapted and reprinted by permission. 




Table XVII (a). Critical Values of the Distribution of γ̂₁
for Testing Skewness


This table gives the upper-tailed critical values of the sample estimate of the
coefficient of skewness (γ̂₁). The critical values are given for α = 0.05, 0.01, and
the sample size n = 25 (5) 50 (10) 100 (25) 200 (50) 500. Since the distribution
of the statistic γ̂₁ is symmetrical about zero, the one-tailed critical values also
represent two-tailed values of 0.10 and 0.02. For example, for α = 0.05 and
n = 30, the desired critical value is obtained as 0.661.
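As an illustration of how the table is used, the sketch below computes a sample skewness coefficient and the normal-theory standard deviation that appears in the "Standard Deviation" column. It assumes the statistic is the usual moment ratio m3 / m2^(3/2) with divisor n, which this section does not state explicitly; the function names are hypothetical.

```python
import math


def sample_skewness(x):
    """Moment-ratio skewness m3 / m2**1.5 (assumed definition, divisor n)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    return m3 / m2 ** 1.5


def skewness_sd_under_normality(n):
    """Standard deviation of the skewness coefficient for normal samples."""
    return math.sqrt(6.0 * (n - 2) / ((n + 1) * (n + 3)))


if __name__ == "__main__":
    # The formula reproduces the tabulated standard deviations.
    for n in (25, 30, 100):
        print(n, f"{skewness_sd_under_normality(n):.4f}")
```

With n = 30, an observed coefficient whose absolute value exceeds 0.661 would be significant at the two-tailed 0.10 level, as the paragraph above describes.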


Critical Value (a) Critical Value (a) 

Standard Standard 

n 0.05 0.01 Deviation n 0.05 0.01 Deviation 
25 0.711 1.061 0.4354 100 0.389 0.567 0.2377 
30 0.661 0.982 0.4052 125 0.350 0.508 0.2139 
35 0.621 0.921 0.3804 150 0.321 0.464 0.1961 
40 0.587 0.869 0.3596 175 0.298 0.430 0.1820 
45 0.558 0.825 0.3418 200 0.280 0.403 0.1706 
50 0.533 0.787 0.3264 250 0.251 0.360 0.1531 
60 0.492 0.723 0.3009 300 0.230 0.329 0.1400 
70 0.459 0.673 0.2806 350 0.213 0.305 0.1298 
80 0.432 0.631 0.2638 400 0.200 0.285 0.1216 
90 0.409 0.596 0.2498 450 0.188 0.269 0.1147 
100 0.389 0.567 0.2377 500 0.179 0.255 0.1089 


Table XVII (b). Critical Values of the Distribution of γ̂₂
for Testing Kurtosis


This table gives upper- and lower-tailed critical values of the sample estimate of
the coefficient of kurtosis (γ̂₂). The critical values are given for α = 0.05, 0.01,
and the sample size n = 50 (25) 150 (50) 1000 (200) 2000. For example, for
α = 0.05 and n = 50, the upper-tailed critical value is obtained as 3.99.
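A short sketch of the corresponding calculation follows. It assumes the kurtosis coefficient is the moment ratio m4 / m2^2 (so that a normal population gives a value near 3, consistent with the tabulated values); that definition and the helper names are assumptions, not taken from this section.

```python
def sample_kurtosis(x):
    """Moment-ratio kurtosis m4 / m2**2 (assumed definition, divisor n)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / m2 ** 2


def outside_critical_values(stat, lower, upper):
    """True when the statistic falls below the lower or above the upper point."""
    return stat < lower or stat > upper


if __name__ == "__main__":
    # Tabulated points for n = 50 and alpha = 0.05 in each tail.
    lower_050, upper_050 = 2.15, 3.99
    print(outside_critical_values(3.2, lower_050, upper_050))
```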


             Critical Value (α)                         Critical Value (α)
            Upper         Lower                        Upper         Lower
    n    0.01   0.05   0.05   0.01           n      0.01   0.05   0.05   0.01
   50    4.88   3.99   2.15   1.95          600     3.54   3.34   2.70   2.60
   75    4.59   3.87   2.27   2.08          650     3.52   3.33   2.71   2.61
  100    4.39   3.77   2.55   2.18          700     3.50   3.31   2.72   2.62
  125    4.24   3.71   2.40   2.24          750     3.48   3.30   2.73   2.64
  150    4.13   3.65   2.45   2.29          800     3.46   3.29   2.74   2.65
                                            850     3.45   3.28   2.74   2.66
  200    3.98   3.57   2.51   2.37          900     3.43   3.28   2.75   2.66
  250    3.87   3.52   2.59   2.42          950     3.42   3.27   2.76   2.67
  300    3.79   3.47   2.59   2.46         1000     3.41   3.26   2.76   2.68
  350    3.72   3.44   2.62   2.50
  400    3.67   3.41   2.64   2.52         1200     3.51   3.24   2.78   2.71
  450    3.63   3.39   2.66   2.55         1400     3.34   3.22   2.80   2.72
  500    3.60   3.37   2.67   2.57         1600     3.32   3.21   2.81   2.74
  550    3.57   3.35   2.69   2.58         1800     3.30   3.20   2.82   2.76
  600    3.54   3.34   2.70   2.60         2000     3.28   3.18   2.83   2.77


From E. S. Pearson and H. O. Hartley, Biometrika Tables for Statisticians, Vol. I, 
Third Edition, © 1970 by Cambridge University Press, Cambridge. Reprinted 
by permission (from Table 34B and 34C). 




Table XVIII. Coefficients of Order Statistics for the 
Shapiro-Wilk’s W Test for Normality 


The table gives coefficients {a_{n-i+1}} (i = 1, 2, ...) of the order statistics for
determining the Shapiro-Wilk’s W statistic. The coefficients are given for n = 2
(1) 30. Shapiro and Wilk (1965) used approximations for n > 20. The values
given here are exact up to n = 30.


                                              n
  i       2       3      10      11      12      19      20      21      22      29      30
  1  0.7071  0.7071  0.5739  0.5601  0.5475  0.4808  0.4734  0.4664  0.4598  0.4214  0.4168
  2       -  0.0000  0.3290  0.3315  0.3325  0.3232  0.3211  0.3189  0.3167  0.3014  0.2993
  3       -       -  0.2141  0.2260  0.2347  0.2561  0.2565  0.2567  0.2566  0.2525  0.2516
  4       -       -  0.1224  0.1429  0.1586  0.2059  0.2085  0.2106  0.2122  0.2168  0.2169
  5       -       -  0.0399  0.0695  0.0922  0.1641  0.1686  0.1724  0.1756  0.1878  0.1886
  6       -       -       -  0.0000  0.0303  0.1271  0.1334  0.1388  0.1435  0.1628  0.1643
  7       -       -       -       -       -  0.0932  0.1013  0.1083  0.1144  0.1404  0.1427
  8       -       -       -       -       -  0.0612  0.0712  0.0798  0.0873  0.1200  0.1228
  9       -       -       -       -       -  0.0303  0.0422  0.0525  0.0615  0.1008  0.1044
 10       -       -       -       -       -  0.0000  0.0140  0.0261  0.0366  0.0827  0.0869
 11       -       -       -       -       -       -       -  0.0000  0.0121  0.0654  0.0702
 12       -       -       -       -       -       -       -       -       -  0.0486  0.0541
 13       -       -       -       -       -       -       -       -       -  0.0322  0.0383
 14       -       -       -       -       -       -       -       -       -  0.0160  0.0229
 15       -       -       -       -       -       -       -       -       -  0.0000  0.0076

From T. J. Lorenzen and V. L. Anderson, Design of Experiments: A No-Name 
Approach, © 1993 by Marcel Dekker, New York. Reprinted by permission. 
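To show how the coefficients enter the calculation, the sketch below evaluates W for a sample of size n = 10 using the five tabulated coefficients for that sample size. It assumes the usual form W = b² / Σ(x − x̄)², where b sums each coefficient times the difference between the i-th largest and i-th smallest observation; the function name and the sample data are hypothetical.

```python
# Coefficients a_{n-i+1}, i = 1..5, for n = 10 as tabulated.
A_N10 = (0.5739, 0.3290, 0.2141, 0.1224, 0.0399)


def shapiro_wilk_w(x, coeffs):
    """W = b**2 / SS, with b = sum a_i * (x_(n+1-i) - x_(i)) (assumed form)."""
    xs = sorted(x)
    n = len(xs)
    mean = sum(xs) / n
    ss = sum((v - mean) ** 2 for v in xs)
    # Pair the i-th largest with the i-th smallest order statistic.
    b = sum(a * (xs[n - 1 - i] - xs[i]) for i, a in enumerate(coeffs))
    return b * b / ss


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # illustrative values only
    print(f"W = {shapiro_wilk_w(data, A_N10):.4f}")
```

Small values of W indicate departure from normality, so the computed W is compared with the lower critical value for the same n.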




Table XIX. Critical Values of the Shapiro-Wilk’s W Test 
for Normality 


This table gives critical values of the Shapiro-Wilk’s W test for normality. The
critical values are given for α = 0.01, 0.02, 0.05, 0.10, 0.50, and n = 3 (1) 50.


Critical Value (a) 
n 0.01 0.02 0.05 0.10 0.50 


3 0.753 0.756 0.767 0.789 0.959 
4 0.687 0.707 0.748 0.792 0.935 
5 0.686 0.715 0.762 0.806 0.927 
6 0.713 0.743 0.788 0.826 0.927 
7 0.730 0.760 0.803 0.838 0.928 
8 0.749 0.778 0.818 0.851 0.932 
9 0.764 0.791 0.829 0.859 0.935 
10 0.781 0.806 0.842 0.869 0.938 
11 0.792 0.817 0.850 0.876 0.940 
12 0.805 0.828 0.859 0.883 0.943 
13 0.814 0.837 0.866 0.889 0.945 
14 0.825 0.846 0.874 0.895 0.947 
15 0.835 0.855 0.881 0.901 0.950 
16 0.844 0.863 0.887 0.906 0.952 
17 0.851 0.869 0.892 0.910 0.954 
18 0.858 0.874 0.897 0.914 0.956 
19 0.863 0.879 0.901 0.917 0.957 
20 0.868 0.884 0.905 0.920 0.959 
21 0.873 0.888 0.908 0.923 0.960 
22 0.878 0.892 0.911 0.926 0.961 
23 0.881 0.895 0.914 0.928 0.962 
24 0.884 0.898 0.916 0.930 0.963 
25 0.888 0.901 0.918 0.931 0.964 
26 0.891 0.904 0.920 0.933 0.965 
27 0.894 0.906 0.923 0.935 0.965 
28 0.896 0.908 0.924 0.936 0.966
29 0.898 0.910 0.926 0.937 0.966 
30 0.900 0.912 0.927 0.939 0.967 
31 0.902 0.914 0.929 0.940 0.967 
32 0.904 0.915 0.930 0.941 0.968 
33 0.906 0.917 0.931 0.942 0.968 
34 0.908 0.919 0.933 0.943 0.969 
35 0.910 0.920 0.934 0.944 0.969 
36 0.912 0.922 0.935 0.945 0.970 
37 0.914 0.924 0.936 0.946 0.970 
38 0.916 0.925 0.938 0.947 0.971 
39 0.917 0.927 0.939 0.948 0.971 
40 0.919 0.928 0.940 0.949 0.972 
41 0.920 0.929 0.941 0.950 0.972
42 0.922 0.930 0.942 0.951 0.972 
43 0.923 0.932 0.943 0.951 0.973 
44 0.924 0.933 0.944 0.952 0.973 
45 0.926 0.934 0.945 0.953 0.973 
46 0.927 0.935 0.945 0.953 0.974 
47 0.928 0.928 0.946 0.954 0.974 
48 0.929 0.937 0.947 0.954 0.974 
49 0.929 0.937 0.947 0.955 0.974 
50 0.930 0.938 0.947 0.955 0.974 


From S. S. Shapiro and M. B. Wilk, “An Analysis of Variance Test for Nor- 
mality (Complete Samples),” Biometrika, 52 (1965), 591-611. Reprinted by 
permission. 




Table XX. Critical Values of the D’Agostino’s D Test 
for Normality 


This table gives critical values of the D’Agostino’s D test for normality. The
critical values are given for α = 0.20, 0.10, 0.05, 0.02, 0.01; and n = 10 (2)
50 (10) 100 (20) 200 (50) 1000 (250) 2000.
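For orientation, a minimal sketch of the statistic follows. It assumes D'Agostino's original form, D = T / (n² s) with T = Σ (i − (n + 1)/2) x_(i) and s² = Σ(x − x̄)² / n; this definition is not given in the present section, and the function name is hypothetical.

```python
import math


def dagostino_d(x):
    """D = T / (n**2 * s), s using divisor n (assumed definition)."""
    xs = sorted(x)
    n = len(xs)
    mean = sum(xs) / n
    ss = sum((v - mean) ** 2 for v in xs)
    # T weights each order statistic by its centered rank.
    t = sum((i - (n + 1) / 2.0) * v for i, v in enumerate(xs, start=1))
    return t / (n ** 1.5 * math.sqrt(ss))


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # illustrative values only
    print(f"D = {dagostino_d(data):.4f}")
```

Under this definition D is close to 1 / (2√π) ≈ 0.2821 for normal data, which is why the tabulated lower and upper points bracket that value.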


                                   Critical Value (α), given as lower, upper
    n         0.20              0.10              0.05              0.02              0.01

   10   0.2632, 0.2835    0.2573, 0.2843    0.2513, 0.2849    0.2436, 0.2855    0.2379, 0.2857
   12   0.2653, 0.2841    0.2598, 0.2849    0.2544, 0.2854    0.2473, 0.2859    0.2420, 0.2862
   14   0.2669, 0.2846    0.2618, 0.2853    0.2568, 0.2858    0.2503, 0.2862    0.2455, 0.2865
   16   0.2681, 0.2848    0.2634, 0.2855    0.2587, 0.2860    0.2527, 0.2865    0.2482, 0.2867
   18   0.2690, 0.2850    0.2646, 0.2855    0.2603, 0.2862    0.2547, 0.2866    0.2505, 0.2868
   20   0.2699, 0.2852    0.2657, 0.2857    0.2617, 0.2863    0.2564, 0.2867    0.2525, 0.2869

   22   0.2705, 0.2853    0.2670, 0.2859    0.2629, 0.2864    0.2579, 0.2869    0.2542, 0.2870
   24   0.2711, 0.2853    0.2675, 0.2860    0.2638, 0.2865    0.2591, 0.2870    0.2557, 0.2871
   26   0.2717, 0.2854    0.2682, 0.2861    0.2647, 0.2866    0.2603, 0.2870    0.2570, 0.2872
   28   0.2721, 0.2854    0.2688, 0.2861    0.2655, 0.2866    0.2612, 0.2870    0.2581, 0.2873
   30   0.2725, 0.2854    0.2693, 0.2861    0.2662, 0.2866    0.2622, 0.2871    0.2592, 0.2872

   32   0.2729, 0.2854    0.2698, 0.2862    0.2668, 0.2867    0.2630, 0.2871    0.2600, 0.2873
   34   0.2732, 0.2854    0.2703, 0.2862    0.2674, 0.2867    0.2636, 0.2871    0.2609, 0.2873
   36   0.2735, 0.2854    0.2707, 0.2862    0.2679, 0.2867    0.2643, 0.2871    0.2617, 0.2873
   38   0.2738, 0.2854    0.2710, 0.2862    0.2683, 0.2867    0.2649, 0.2871    0.2623, 0.2873
   40   0.2740, 0.2854    0.2714, 0.2862    0.2688, 0.2867    0.2655, 0.2871    0.2630, 0.2874

   42   0.2743, 0.2854    0.2717, 0.2861    0.2691, 0.2867    0.2659, 0.2871    0.2636, 0.2874
   44   0.2745, 0.2854    0.2720, 0.2861    0.2695, 0.2867    0.2664, 0.2871    0.2641, 0.2874
   46   0.2747, 0.2854    0.2722, 0.2861    0.2698, 0.2866    0.2668, 0.2871    0.2646, 0.2874
   48   0.2749, 0.2854    0.2725, 0.2861    0.2702, 0.2866    0.2672, 0.2871    0.2651, 0.2874
   50   0.2751, 0.2853    0.2727, 0.2861    0.2705, 0.2866    0.2676, 0.2871    0.2655, 0.2874

   60   0.2757, 0.2852    0.2737, 0.2860    0.2717, 0.2865    0.2692, 0.2870    0.2673, 0.2873
   70   0.2763, 0.2851    0.2744, 0.2859    0.2726, 0.2864    0.2708, 0.2869    0.2687, 0.2872
   80   0.2768, 0.2850    0.2750, 0.2857    0.2734, 0.2863    0.2713, 0.2868    0.2698, 0.2871
   90   0.2771, 0.2849    0.2755, 0.2856    0.2740, 0.2862    0.2721, 0.2866    0.2707, 0.2870
  100   0.2774, 0.2849    0.2759, 0.2855    0.2745, 0.2860    0.2727, 0.2865    0.2714, 0.2869

  120   0.2779, 0.2847    0.2765, 0.2853    0.2752, 0.2858    0.2737, 0.2863    0.2725, 0.2866
  140   0.2782, 0.2846    0.2770, 0.2852    0.2758, 0.2856    0.2744, 0.2862    0.2734, 0.2865
  160   0.2785, 0.2845    0.2774, 0.2851    0.2763, 0.2855    0.2750, 0.2860    0.2741, 0.2863
  180   0.2787, 0.2844    0.2777, 0.2850    0.2767, 0.2854    0.2755, 0.2859    0.2746, 0.2862
  200   0.2789, 0.2843    0.2779, 0.2848    0.2770, 0.2853    0.2759, 0.2857    0.2751, 0.2860

  250   0.2793, 0.2841    0.2784, 0.2846    0.2776, 0.2850    0.2767, 0.2855    0.2760, 0.2858
  300   0.2796, 0.2840    0.2788, 0.2844    0.2781, 0.2848    0.2772, 0.2853    0.2766, 0.2855
  350   0.2798, 0.2839    0.2791, 0.2843    0.2784, 0.2847    0.2776, 0.2851    0.2771, 0.2853
  400   0.2799, 0.2838    0.2793, 0.2842    0.2787, 0.2845    0.2780, 0.2849    0.2775, 0.2852
  450   0.2801, 0.2837    0.2795, 0.2841    0.2789, 0.2844    0.2782, 0.2848    0.2778, 0.2851

  500   0.2802, 0.2836    0.2796, 0.2840    0.2791, 0.2843    0.2785, 0.2847    0.2780, 0.2849
  600   0.2804, 0.2835    0.2799, 0.2839    0.2794, 0.2842    0.2788, 0.2845    0.2784, 0.2847
  700   0.2805, 0.2834    0.2800, 0.2838    0.2796, 0.2840    0.2791, 0.2844    0.2787, 0.2846
  800   0.2806, 0.2833    0.2802, 0.2837    0.2798, 0.2839    0.2793, 0.2842    0.2790, 0.2844
  900   0.2807, 0.2833    0.2803, 0.2836    0.2799, 0.2838    0.2795, 0.2841    0.2792, 0.2843

 1000   0.2808, 0.2832    0.2804, 0.2835    0.2800, 0.2838    0.2796, 0.2840    0.2793, 0.2842
 1250   0.2809, 0.2831    0.2806, 0.2834    0.2803, 0.2836    0.2799, 0.2839    0.2797, 0.2840
 1500   0.2810, 0.2830    0.2807, 0.2833    0.2805, 0.2835    0.2801, 0.2837    0.2799, 0.2839
 1750   0.2811, 0.2830    0.2808, 0.2832    0.2806, 0.2834    0.2803, 0.2836    0.2801, 0.2838
 2000   0.2812, 0.2829    0.2809, 0.2831    0.2807, 0.2833    0.2804, 0.2835    0.2802, 0.2837


From R. B. D’Agostino and M. A. Stephens, Goodness-of-Fit Techniques, ©
1986 by Marcel Dekker, Inc., New York. Reprinted by permission.




Table XXI. Critical Values of the Bartlett’s Test for Homogeneity 
of Variances 


This table gives critical values of Bartlett’s test for homogeneity of variances
having equal sample sizes in each group. Bartlett’s test statistic is the ratio of the
weighted geometric mean of the sample variances to their weighted arithmetic
mean (the weights are relative degrees of freedom). The critical values are given
for α = 0.01, 0.05, 0.10; the number of groups p = 2 (1) 10; and the sample
size in each group n = 3 (1) 30 (10) 60 (20) 100. We reject the hypothesis of
homogeneity of variances at the α-level of significance if $B < B_p(n, \alpha)$, where
$B$ is the calculated value of Bartlett’s statistic and $B_p(n, \alpha)$ is the critical
value having an area of size α in the left tail of Bartlett’s distribution. The
critical values for equal sample sizes given in this table can also be used to
obtain a highly accurate approximation of the critical values in the unequal
sample size case by employing the following relation:

\[
B_p(n_1, n_2, \ldots, n_p; \alpha) \approx \frac{n_1}{N} B_p(n_1, \alpha) + \frac{n_2}{N} B_p(n_2, \alpha)
+ \cdots + \frac{n_p}{N} B_p(n_p, \alpha),
\]

where $N = \sum_{i=1}^{p} n_i$, $B_p(n_1, n_2, \ldots, n_p; \alpha)$ denotes the α-level critical value
of Bartlett’s test statistic with p groups having $n_1, n_2, \ldots, n_p$ observations,
respectively, and $B_p(n_i, \alpha)$, $i = 1, 2, \ldots, p$, denotes the α-level critical value
in the equal sample size case with $n_i$ observations in all p groups. For a given
p, where p = 2 (1) 10, and for any combination of sample sizes from 5 (1)
100, the absolute error of this approximation is less than 0.005 (the percentage
relative error is less than one-half of one percent) when α = 0.05, 0.10, or
0.25. When α = 0.01, the absolute error is approximately 0.015 in the extreme
case and less than 0.005 when $\min(n_1, n_2, \ldots, n_p) \geq 10$. The approximation
can be improved with the help of correction factors given in Dyer and Keating
(1980, Table 2), and the absolute error of the corrected approximation is as small
as for the other α values. To illustrate, suppose p = 4 and $n_1 = 5$, $n_2 = 6$,
$n_3 = 10$, $n_4 = 50$. Using the relation given previously, $B_4(5, 6, 10, 50; 0.01) \approx$
(5/71) (0.4607) + (6/71) (0.5430) + (10/71) (0.7195) + (50/71) (0.9433) =
0.8440. Using the correction factors given in Dyer and Keating (1980,
Table 2), it can be shown that $B_4(5, 6, 10, 50; 0.01) = 0.8364$. The exact value
is $B_4(5, 6, 10, 50; 0.01) = 0.8359$.
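The two calculations described above can be sketched as follows. The first function forms the ratio of the weighted geometric mean to the weighted arithmetic mean of the sample variances, with weights proportional to the degrees of freedom n_i − 1; the second applies the weighted-average relation for unequal sample sizes. The function names are hypothetical, and the four equal-sample-size critical values are the ones quoted in the worked example.

```python
import math


def bartlett_ratio(variances, sizes):
    """Weighted geometric mean over weighted arithmetic mean of variances."""
    dfs = [n - 1 for n in sizes]
    total_df = sum(dfs)
    log_geo = sum(df * math.log(v) for df, v in zip(dfs, variances)) / total_df
    arith = sum(df * v for df, v in zip(dfs, variances)) / total_df
    return math.exp(log_geo) / arith


def approx_critical_value(sizes, equal_n_values):
    """Weighted average of equal-sample-size critical values, weights n_i / N."""
    total_n = sum(sizes)
    return sum(n / total_n * equal_n_values[n] for n in sizes)


if __name__ == "__main__":
    # B_4(n, 0.01) for n = 5, 6, 10, 50, as quoted in the example above.
    b4_at_001 = {5: 0.4607, 6: 0.5430, 10: 0.7195, 50: 0.9433}
    approx = approx_critical_value([5, 6, 10, 50], b4_at_001)
    print(f"{approx:.4f}")
```

The hypothesis of equal variances would be rejected when the computed ratio falls below the critical value, since the test uses the left tail.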


Number of Groups (p)
α = 0.01


Table XXI (continued)


Number of Groups (p)
   n       5

   3    .3299
   4    .4921
   5    .5952

   6    .6646
   7    .7142
   8    .7512
   9    .7798
  10    .8025

  11    .8210
  12    .8364
  13    .8493
  14    .8604
  15    .8699

  16    .8782
  17    .8856
  18    .8921
  19    .8979
  20    .9031

  21    .9078
  22    .9120
  23    .9159
  24    .9195
  25    .9228

  26    .9258
  27    .9286
  28    .9312
  29    .9336
  30    .9358

  40    .9520
  50    .9617
  60    .9681
  80    .9761
 100    .9809




Table XXI (continued)


                         Number of Groups (p)
   n      2      3      4      5      6      7      8      9     10
                               α = 0.10

   3  .4359  .3991  .3966  .4006  .4061  .4116      —      —      —
   4  .5928  .5583  .5551  .5582  .5626  .5673  .5717  .5759  .5797
   5  .6842  .6539  .6507  .6530  .6566  .6605  .6642  .6676  .6708

   6  .7429  .7163  .7133  .7151  .7182  .7214  .7245  .7274  .7301
   7  .7834  .7600  .7572  .7587  .7612  .7640  .7667  .7692  .7716
   8  .8130  .7921  .7895  .7908  .7930  .7955  .7978  .8000  .8021
   9  .8356  .8168  .8143  .8154  .8174  .8196  .8217  .8236  .8254
  10  .8533  .8362  .8339  .8349  .8367  .8386  .8405  .8423  .8439

  11  .8676  .8519  .8498  .8507  .8523  .8540  .8557  .8574  .8589
  12  .8794  .8649  .8629  .8637  .8652  .8668  .8683  .8698  .8712
  13  .8892  .8758  .8740  .8746  .8760  .8775  .8789  .8803  .8816
  14  .8976  .8851  .8833  .8840  .8852  .8866  .8879  .8892  .8904
  15  .9048  .8931  .8914  .8920  .8932  .8944  .8957  .8969  .8980

  16  .9110  .9000  .8985  .8990  .9001  .9013  .9025  .9036  .9046
  17  .9165  .9061  .9046  .9051  .9062  .9073  .9084  .9094  .9104
  18  .9214  .9115  .9101  .9106  .9115  .9126  .9137  .9146  .9156
  19  .9257  .9163  .9150  .9154  .9163  .9174  .9183  .9193  .9201
  20  .9295  .9206  .9194  .9198  .9207  .9216  .9226  .9234  .9243

  21  .9330  .9245  .9233  .9237  .9245  .9255  .9263  .9272  .9280
  22  .9362  .9281  .9269  .9273  .9281  .9289  .9298  .9306  .9313
  23  .9390  .9313  .9302  .9305  .9313  .9321  .9329  .9337  .9344
  24  .9417  .9342  .9332  .9335  .9342  .9350  .9358  .9365  .9372
  25  .9441  .9369  .9359  .9362  .9369  .9377  .9384  .9391  .9398

  26  .9463  .9394  .9384  .9387  .9394  .9401  .9408  .9415  .9421
  27  .9484  .9417  .9408  .9410  .9417  .9424  .9431  .9437  .9443
  28  .9503  .9439  .9429  .9432  .9438  .9445  .9452  .9458  .9464
  29  .9520  .9458  .9449  .9452  .9458  .9464  .9471  .9477  .9483
  30  .9537  .9477  .9468  .9471  .9476  .9483  .9489  .9495  .9500

  40  .9655  .9610  .9603  .9605  .9609  .9614  .9619  .9623  .9627
  50  .9725  .9689  .9683  .9685  .9688  .9692  .9696  .9699  .9703
  60  .9771  .9741  .9737  .9738  .9741  .9744  .9747  .9750  .9753
  80  .9829  .9806  .9803  .9804  .9806  .9808  .9811  .9813  .9815
 100  .9864  .9845  .9843  .9843  .9845  .9847  .9849  .9851  .9852


From D. D. Dyer and J. P. Keating, “On the Determination of Critical Values
for Bartlett’s Test,” Journal of the American Statistical Association, 75 (1980),
313-319. Abridged and reprinted by permission.




Table XXII. Critical Values of the Hartley’s Maximum F Ratio Test 
for Homogeneity of Variances 


This table gives critical values of the Hartley’s maximum F ratio test for
homogeneity of variances having equal sample sizes in each group. The critical
values are given for α = 0.05, 0.01; the number of groups p = 2 (1) 12; and the
number of degrees of freedom for variance estimate ν = 2 (1) 10, 12, 15, 20,
30, 60, ∞.
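A minimal sketch of the test statistic follows, assuming the usual definition of the maximum F ratio as the largest sample variance divided by the smallest, with every variance based on the same ν degrees of freedom. The function name and the sample variances are hypothetical.

```python
def hartley_f_max(variances):
    """Largest sample variance divided by the smallest (assumed definition)."""
    if min(variances) <= 0:
        raise ValueError("sample variances must be positive")
    return max(variances) / min(variances)


if __name__ == "__main__":
    # Three groups, each variance on nu = 9 degrees of freedom (illustrative).
    sample_variances = [4.0, 1.0, 2.0]
    critical_value = 5.34  # tabulated value for alpha = 0.05, p = 3, nu = 9
    f_max = hartley_f_max(sample_variances)
    print(f_max, f_max > critical_value)
```

Homogeneity is rejected when the computed ratio exceeds the tabulated value for the given p and ν.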


                                             Number of Groups (p)
  ν   α         2        3        4        5        6        7        8        9       10       11       12

  2  .05    39.00    87.50   142.00   202.00   266.00   333.00   403.00   475.00   550.00   626.00   704.00
     .01   199.00   448.00   729.00  1036.00  1362.00  1705.00  2063.00  2432.00  2813.00  3204.00  3605.00
  3  .05    15.40    27.80    39.20    50.70    62.00    72.90    83.50    93.90   104.00   114.00   124.00
     .01    47.50    85.00   120.00   151.00   184.00   216.00   249.00   281.00   310.00   337.00   361.00
  4  .05     9.60    15.50    20.60    25.20    29.50    33.60    37.50    41.40    44.60    48.00    51.40
     .01    23.20    37.00    49.00    59.00    69.00    79.00    89.00    97.00   106.00   113.00   120.00
  5  .05     7.15    10.80    13.70    16.30    18.70    20.80    22.90    24.70    26.50    28.20    29.90
     .01    14.90    22.00    28.00    33.00    38.00    42.00    46.00    50.00    54.00    57.00    60.00
  6  .05     5.82     8.38    10.40    12.10    13.70    15.00    16.30    17.50    18.60    19.70    20.70
     .01    11.10    15.50    19.10    22.00    25.00    27.00    30.00    32.00    34.00    36.00    37.00
  7  .05     4.99     6.94     8.44     9.70    10.80    11.80    12.70    13.50    14.30    15.10    15.80
     .01     8.89    12.10    14.50    16.50    18.40    20.00    22.00    23.00    24.00    26.00    27.00
  8  .05     4.43     6.00     7.48     8.12     9.03     9.78    10.50    11.10    11.70    12.20    12.70
     .01     7.50     9.90    11.70    13.20    14.50    15.80    16.90    17.90    18.90    19.80    21.00
  9  .05     4.03     5.34     6.31     7.11     7.80     8.41     8.95     9.45     9.91    10.30    10.70
     .01     6.54     8.50     9.90    11.10    12.10    13.10    13.90    14.70    15.30    16.00    16.60
 10  .05     3.72     4.85     5.67     6.34     6.92     7.42     7.87     8.28     8.66     9.01     9.34
     .01     5.85     7.40     8.60     9.60    10.40    11.10    11.80    12.40    12.90    13.40    13.90
 12  .05     3.28     4.16     4.79     5.30     5.72     6.09     6.42     6.72     7.00     7.25     7.48
     .01     4.91     6.10     6.90     7.60     8.20     8.70     9.10     9.50     9.90    10.20    10.60
 15  .05     2.86     3.54     4.01     4.37     4.68     4.95     5.19     5.40     5.59     5.77     5.93
     .01     4.07     4.90     5.50     6.00     6.40     6.70     7.10     7.30     7.50     7.80     8.00
 20  .05     2.46     2.95     3.29     3.54     3.76     3.94     4.10     4.24     4.37     4.49     4.59
     .01     3.32     3.80     4.30     4.60     4.90     5.10     5.30     5.50     5.60     5.80     5.90
 30  .05     2.07     2.40     2.61     2.78     2.91     3.02     3.12     3.21     3.29     3.36     3.39
     .01     2.63     3.00     3.30     3.40     3.60     3.70     3.80     3.90     4.00     4.10     4.20
 60  .05     1.67     1.85     1.96     2.04     2.11     2.17     2.22     2.26     2.30     2.33     2.36
     .01     1.96     2.20     2.30     2.40     2.40     2.50     2.50     2.60     2.60     2.70     2.70
  ∞  .05     1.00     1.00     1.00     1.00     1.00     1.00     1.00     1.00     1.00     1.00     1.00
     .01     1.00     1.00     1.00     1.00     1.00     1.00     1.00     1.00     1.00     1.00     1.00


From H. A. David, “Upper 5 and 1% Points of the Maximum F-Ratio,”
Biometrika, 39 (1952), 422-424. Reprinted by permission.




Table XXIII. Critical Values of the Cochran’s C Test 
for Homogeneity of Variances 


This table gives critical values of Cochran’s C test for homogeneity of
variances having equal sample sizes in each group. The critical values are given for
α = 0.05, 0.01; the number of groups p = 2 (1) 10, 12, 15, 20, 24, 30, 40, 60,
120; and the number of degrees of freedom for variance estimate ν = 1 (1) 10,
16, 36, 144, ∞.
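A minimal sketch of the statistic follows, assuming the usual definition of Cochran's C as the largest sample variance divided by the sum of all p sample variances. The function name and the sample variances are hypothetical. When all variances are equal the ratio is exactly 1/p, which is the value the table approaches as ν grows without bound.

```python
def cochran_c(variances):
    """Largest sample variance over the sum of all variances (assumed definition)."""
    total = sum(variances)
    if total <= 0:
        raise ValueError("sample variances must have a positive sum")
    return max(variances) / total


if __name__ == "__main__":
    # Six groups with illustrative variances.
    print(f"{cochran_c([2.1, 1.4, 3.9, 1.8, 2.2, 1.6]):.4f}")
    # Equal variances give the limiting value 1/p.
    print(f"{cochran_c([1.0] * 6):.4f}")
```

Homogeneity is rejected when the computed C exceeds the tabulated value for the given p and ν.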


                                  Number of Groups (p)
   ν   α       6      7      8      9     10     12     15     24     40     60    120

   1  .05  .7808         .6798  .6385  .6020  .5410  .4709  .3434  .2370  .1737  .0998
      .01  .8828         .7945  .7544  .7175  .6528  .5747  .4247  .2940  .2151  .1225
   2  .05  .6161         .5157  .4775  .4450  .3924  .3346  .2354  .1567  .1131  .0632
      .01  .7218         .6152  .5727  .5358  .4751  .4069  .2871  .1915  .1371  .0759
   3  .05  .5321  .4800  .4377  .4027  .3733  .3264  .2758  .1907  .1259  .0895  .0495
      .01  .6258  .5685  .5209  .4810  .4469  .3919  .3317  .2295  .1508  .1069  .0585
   4  .05  .4803  .4307  .3910  .3584  .3311  .2880  .2419  .1656  .1082  .0765  .0419
      .01  .5635  .5080  .4627  .4251  .3934  .3428  .2882  .1970  .1281  .0902  .0489
   5  .05  .4447  .3974  .3595  .3286  .3029  .2624  .2195  .1493  .0968  .0682  .0371
      .01  .5195  .4659  .4226  .3870  .3572  .3099  .2593  .1759  .1135  .0796  .0429
   6  .05  .4184  .3726  .3362  .3067  .2823  .2439  .2034  .1374  .0887  .0623  .0337
      .01  .4866  .4347  .3932  .3592  .3308  .2861  .2386  .1608  .1033  .0722  .0387
   7  .05  .3980  .3535  .3185  .2901  .2666  .2299  .1911  .1286  .0827  .0583  .0312
      .01  .4608  .4105  .3704  .3378  .3106  .2680  .2228  .1495  .0957  .0668  .0357
   8  .05  .3817  .3384  .3043  .2768  .2541  .2187  .1815  .1216  .0780  .0552  .0292
      .01  .4401  .3911  .3522  .3207  .2945  .2535  .2104  .1406  .0898  .0625  .0334
   9  .05  .3682  .3259  .2926  .2659  .2439  .2098  .1736  .1160  .0745  .0520  .0279
      .01  .4229  .3751  .3373  .3067  .2813  .2419  .2002  .1388  .0853  .0594  .0316
  10  .05  .3568  .3154  .2829  .2568  .2353  .2020  .1671  .1113  .0713  .0497  .0266
      .01  .4084  .3616  .3248  .2950  .2704  .2320  .1918  .1283  .0816  .0567  .0302
  16  .05  .3135  .2756  .2462  .2226  .2032  .1737  .1429  .0942  .0595  .0411  .0218
      .01  .3529  .3105  .2779  .2514  .2297  .1961  .1612  .1060  .0668  .0461  .0242
  36  .05  .2612  .2278  .2022  .1820  .1655  .1403  .1144  .0743  .0462  .0316  .0165
      .01  .2858  .2494  .2214  .1992  .1811  .1535  .1251  .0810  .0503  .0344  .0178
 144  .05  .2119  .1833  .1616  .1446  .1308  .1100  .0889  .0567  .0347  .0234  .0120
      .01  .2229  .1929  .1700  .1521  .1376  .1157  .0934  .0595  .0363  .0245  .0125
   ∞  .05  .1667  .1429  .1250  .1111  .1000  .0833  .0667  .0417  .0250  .0167  .0083
      .01  .1667  .1429  .1250  .1111  .1000  .0833  .0667  .0417  .0250  .0167  .0083


   p    ν     .05    .01

   3   36   .4748  .5153
   3  144   .4031  .4230
   3    ∞   .3333  .3333
  20    1   .3894  .4799
  20    2   .2705  .3297
  20    3   .2205  .2654
  30   36   .0604  .0658
  30  144   .0457  .0480
  30    ∞   .0333  .0333


From C. Eisenhart, M. W. Hastay, and W. A. Wallis, Techniques of Statistical
Analysis, Chapter 15, pp. 390-391, © 1947 by McGraw-Hill, New York.
Reprinted by permission.




Table XXIV. Random Numbers 


This table gives computer-generated pseudorandom digits that may be drawn
in any direction: horizontal, left-to-right; vertical, up or down. The numbers
may be read in single digits, double digits, or numbers of any size. For ease of
reading, the digits are arranged in groups of five; this grouping should be
ignored when reading the table. The table should be employed using a random
start and crossing out the digits as they are used so that each portion of the table
is used only once in a given experiment.
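The reading procedure described above can be sketched in code. The example below reads fixed-width numbers from a block of grouped digits, ignoring the grouping, starting at a chosen position and returning the position where the next reading should begin, so that no digit is used twice. The function name and the sample digits are hypothetical and are not taken from the table.

```python
def read_numbers(table_text, width, start, count):
    """Read `count` numbers of `width` digits, skipping the five-digit grouping.

    Returns the numbers and the position of the first unused digit.
    """
    digits = [c for c in table_text if c.isdigit()]
    numbers = []
    pos = start
    while len(numbers) < count and pos + width <= len(digits):
        numbers.append(int("".join(digits[pos:pos + width])))
        pos += width
    return numbers, pos


if __name__ == "__main__":
    sample = "10480 15011 01536 02011"  # illustrative digits only
    values, next_pos = read_numbers(sample, width=2, start=3, count=3)
    print(values, next_pos)
```

In practice the starting position would itself be chosen at random, as the paragraph above recommends.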




99S6S 
909CL 
S9C86 
6S¢7C6 
LLLLI 


IL816 
6tLbL 
8hC99 
C6IS8 
DPPEE 


COOLI 
[L760 
LLOOO 
vSO0r 
LIvSl 


vr669 
L969b 
OZL9S 
OL9LI 
CLObL 


Ipppl 
16099 
OChH8 
C6OLE 
C98P9 


88180 
C1987 
OIeee 
Iptis 
£906¢ 


9CLOS 
[8ZcL 
COL9OP 
6LStL 
cScol 


LLS60 
LSOvC 
8SL78 
6P66L 
9LOLD 


C9L9O9 
8910¢ 
CSE98 
CLe69 
SLESO 


68PS8 
S86LP 
69CtP 
8r IPs 
LICLI 


ILpre 
L8010 
C0878 
81S09 
DELPE 


CL608 
L6cSI 
S¢790 
166£8 
v9r06 


8S1SO 
8009 
CEL8L 
vS060 
0669 


CISLL 
vEePSe 
CL190 
viprl 
[ISt? 


9S8CL 
96PS I 
C8L9I 
99PL9 
19¢69 


LOL8 1 
6rbL0 
eSPpL9o 
L80IL 
OOT6E 


O-860 
L88¢S 
LI6tL 
C686E 
CS8Ie 


99S¢0 
80LL9 
CS8LE 
96810 
VLS96 


LC@CLC 
68ItP 
OI9L6 
LS€6e 
bl60r 


96Sh7 
8IL6e 
OrLer 
celep 
6¢099 


86SPt 
Ivps6 
I87c? 
p8coe 
[e9eL 


Lpsi9 
66818 
CCPEE 
886tS 
C6v6e 


CICSe 
€9£96 
L808 
[L9¢8 
9t66¢ 


96be 
cO8ce 
IOILZ 
816th 
0610 


99¢80 
LIvS8 
[8Pvs 
86LbL 
€6L6C 


Ote9L 
CLY8P 
CLOLE 
C69IL 
888SL 


99£18 
89C6P 
6L87C6 
[1798 
[£688 


Lv860 
61L6¢ 
669¢ I 
CCOLI 
vOIL9 


CL8I8 
68¢TL 
CECC9 
PC8IL 
8Cv08 


CeeP6 
86L6S 
16819 
17996 
8P8C8 


9¢6b9 
9S9PS 
986 
CSI6L 
CCI8t 


6tP6S 
908¢9 
C1p98 
980bS 
8P9CP 


8CC8P 
€1806 
8L8r I 
86L01 
S66IP 


CeOlP 
6LO6S¢ 
88tt0 
OLIt9 
L96CS 


9sces 
SOsto0 
68EIe 
6ET9E 
OcOeE 


I7LI¢ 
C1966 
ChL86 
C9OLC 
vcOL8 


tyrre 
S6C8L 
C989 
SIS6t 
6LSS1 


v66LS 
0866S 
I7Lb9 
96688 
OvS00 


L6SS0 
(4454: 
cELCO 
890¢8 
tvs 16 


[69SL 
00086 
0¢969 
Lt906 
OLPLT 


prs0r 
9It16 
CICel 
LIOI6 
O¢Srl 


89ESS 
90L91 
OL9ST 
S19¢0 
Table XXIV (continued)

[Table body not recoverable from the scan: five-digit entries arranged under the
column headings 00-04, 05-09, 10-14, ..., 95-99.]

From G. W. Snedecor and W. G. Cochran, Statistical Methods, Eighth Edition,
© 1989 by Iowa State University Press, Ames, Iowa. Adapted and reprinted by
permission.



Chart I. Power Functions of the Two-Sided Student’s t Test


This chart shows graphs of power functions of the two-sided Student’s t test.
There are two graphs of power functions corresponding to two levels of
significance, $\alpha = 0.01$ and $0.05$. Since the distribution of t is
symmetrical about zero, the two-tailed levels of significance also represent
one-tailed values of 0.005 and 0.025. For each graph, curves are drawn for values
of df = 1, 2, 3, 4, 6, 12, 24, and $\infty$, the degrees of freedom associated
with the variance estimate. The horizontal scale of the graph represents the
noncentrality parameter $\delta$ and the vertical scale the corresponding power.
For example, to test the hypothesis $H_0: \mu = \mu_0$ against the alternative
$H_1: \mu = \mu_1$, the statistic $t = (\bar{X} - \mu_0)/(S/\sqrt{n})$ is used.
The distribution of t when $H_0$ is true is the t distribution with df = n - 1
degrees of freedom, and the critical region $t > t[df, 1 - \alpha]$ would have a
significance level $\alpha$. Now, under $H_1$, the distribution of
$t(\delta) = [(\bar{X} - \mu_1) + \delta\sigma/\sqrt{n}]/(S/\sqrt{n})$
is noncentral t with the noncentrality parameter
$\delta = (\mu_1 - \mu_0)/(\sigma/\sqrt{n})$. The power of the test is given by
$P\{t(\delta) > t[df, 1 - \alpha]\}$. For example, for a two-tailed test with
$\alpha = 0.05$, df = 6, and $\delta = 3$, the corresponding power read
from the graph is approximately 0.70.
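
The chart reading can be cross-checked numerically. The sketch below is not taken from the book. It assumes the representation t(delta) = (Z + delta)/S, with Z standard normal and S the square root of an independent chi-square variable divided by its degrees of freedom, and it integrates over S with Simpson's rule using only the Python standard library. For the two-tailed test it assumes the critical value t[df, 1 - alpha/2] and adds the small lower-tail rejection probability. The function names and the integration settings are illustrative choices, and the infinite-df curve is not covered.

```python
import math

def _phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def _scale_density(s, df):
    """Density of S = sqrt(V/df), with V chi-square on df degrees of freedom."""
    if s == 0.0:
        return math.sqrt(2.0 / math.pi) if df == 1 else 0.0
    log_f = (math.log(2.0) + 0.5 * df * math.log(0.5 * df)
             + (df - 1) * math.log(s) - 0.5 * df * s * s
             - math.lgamma(0.5 * df))
    return math.exp(log_f)

def nct_upper_tail(c, df, delta, steps=2000):
    """P{t(delta) > c}, computed as E[Phi(delta - c*S)] by Simpson's rule."""
    upper = 1.0 + 12.0 / math.sqrt(df)  # density of S is negligible beyond this
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        s = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * _phi(delta - c * s) * _scale_density(s, df)
    return total * h / 3.0

def t_quantile(p, df):
    """Central t critical value t[df, p] for p > 0.5, found by bisection."""
    lo, hi = 0.0, 1000.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if 1.0 - nct_upper_tail(mid, df, 0.0) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def two_sided_power(alpha, df, delta):
    """Power of the two-tailed t test at noncentrality delta."""
    c = t_quantile(1.0 - alpha / 2.0, df)
    upper_rejection = nct_upper_tail(c, df, delta)
    lower_rejection = 1.0 - nct_upper_tail(-c, df, delta)
    return upper_rejection + lower_rejection

# The example from the text: alpha = 0.05, df = 6, delta = 3.
print(round(two_sided_power(0.05, 6, 3.0), 2))
```

Under these assumptions the printed value should land close to the chart reading of about 0.70; setting delta to 0 recovers the significance level, which is a convenient sanity check on the integration.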


[Chart I appears here as two panels, one for $\alpha = 0.01$ and one for
$\alpha = 0.05$. Vertical axis: power (in percent). The plotted curves are not
recoverable from the scan.]

From D. B. Owen, Handbook of Statistical Tables, © 1962 by Addison-Wesley. 


Reprinted by permission. 




Chart II. Power Functions of the Analysis of Variance F Tests 
(Fixed Effects Model): Pearson-Hartley Charts 


This chart shows a set of graphs of power functions of the analysis of variance
F tests in a fixed effects analysis of variance model. There are eight graphs
for eight values of the numerator degrees of freedom $\nu_1 = 1\,(1)\,8$, that
is, 1 through 8 in steps of 1. For each value of $\nu_1$, there are several
values of the denominator degrees of freedom
$\nu_2 = 6\,(1)\,10, 12, 15, 20, 30, 60, \infty$. Each graph depicts two groups
of power functions corresponding to two levels of significance, $\alpha = 0.01$
and $0.05$. The horizontal scale of the graph represents the normalized
noncentrality parameter ($\phi$) and the vertical scale the corresponding power
($1 - \beta$). There are two x-scales depending on which level of significance
is employed. For example, for $\nu_1 = 2$, $\nu_2 = 6$, $\alpha = 0.05$, and
$\phi = 2$, the corresponding power read from the graph is approximately
$1 - \beta = 0.66$.
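
A numerical cross-check of this reading is sketched below. It is not the book's procedure. It assumes the usual Pearson-Hartley convention that the noncentrality parameter of the noncentral F distribution is lambda = phi^2 (nu_1 + 1), which is not restated in the description above, and it writes the noncentral F tail as a Poisson mixture of central F tails. The regularized incomplete beta function is evaluated with a standard continued fraction. All function names are illustrative, and the infinite nu_2 curves are not covered.

```python
import math

def _guard(v, tiny=1e-300):
    """Keep a continued-fraction term away from zero."""
    return tiny if abs(v) < tiny else v

def _betacf(a, b, x, max_iter=300, eps=1e-15):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 / _guard(1.0 - qab * x / qap)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 / _guard(1.0 + aa * d)
        c = _guard(1.0 + aa / c)
        h *= d * c
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 / _guard(1.0 + aa * d)
        c = _guard(1.0 + aa / c)
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b

def f_cdf(f, v1, v2):
    """Cumulative distribution function of the central F distribution."""
    return reg_inc_beta(0.5 * v1, 0.5 * v2, v1 * f / (v1 * f + v2))

def f_quantile(p, v1, v2):
    """Central F critical value, found by bisection on f_cdf."""
    lo, hi = 0.0, 1.0e4
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if f_cdf(mid, v1, v2) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def anova_power(alpha, v1, v2, phi, terms=200):
    """Power of the fixed effects F test at normalized noncentrality phi."""
    lam = phi * phi * (v1 + 1.0)        # assumed link between phi and lambda
    f_crit = f_quantile(1.0 - alpha, v1, v2)
    x = v1 * f_crit / (v1 * f_crit + v2)
    weight = math.exp(-0.5 * lam)       # Poisson(lam/2) probability of j = 0
    power = 0.0
    for j in range(terms):
        power += weight * (1.0 - reg_inc_beta(0.5 * v1 + j, 0.5 * v2, x))
        weight *= 0.5 * lam / (j + 1)   # next Poisson probability
    return power

# The example from the text: nu_1 = 2, nu_2 = 6, alpha = 0.05, phi = 2.
print(round(anova_power(0.05, 2, 6, 2.0), 2))
```

With the assumed link between phi and lambda, the result should be close to the chart reading of 0.66. At phi = 0 the function returns the significance level itself, which checks the critical value.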


[Chart II appears here as a series of panels continued over several pages.
Vertical axis: power $1 - \beta$. Horizontal axes: $\phi$ (for $\alpha = 0.05$)
and $\phi$ (for $\alpha = 0.01$). Curves are labeled by
$\nu_2 = 6, 7, 8, 9, 10, 12, 15, 20, 30, 60, \infty$. The plotted curves are not
recoverable from the scan.]

From E. S. Pearson and H. O. Hartley, “Charts of the Power Function of the
Analysis of Variance Tests, Derived from the Non-Central F-Distribution,”
Biometrika, 38 (1951), 112-130. Reprinted by permission.



Chart III. Operating Characteristic Curves for the Analysis of 
Variance F Tests (Random Effects Model) 


This chart shows graphs of curves giving 1 - power for the analysis of variance
F tests in a random effects analysis of variance model. There are eight graphs
for eight values of the numerator degrees of freedom $\nu_1 = 1\,(1)\,8$. For
each value of $\nu_1$, there are several values of the denominator degrees of
freedom $\nu_2 = 6\,(1)\,10, 12, 15, 20, 30, 60, \infty$. Each graph depicts two
groups of operating characteristic curves corresponding to two levels of
significance, $\alpha = 0.01$ and $0.05$. The horizontal scale of each graph
represents the parameter $\lambda = \sqrt{1 + n\sigma_\alpha^2/\sigma_e^2}$
and the vertical scale the corresponding probability of accepting the hypothesis
(1 - power). There are two x-scales depending on which level of significance
is employed. For example, for $\nu_1 = 2$, $\nu_2 = 6$, $\alpha = 0.01$, and
$\lambda = 7$, the corresponding power read from the graph is approximately
1 - 0.20 = 0.80.
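
The example can be checked in closed form. The derivation below is a sketch under two assumptions that the chart description does not spell out: that in the balanced one-way random effects model the F ratio divided by $\lambda^2$ follows a central F distribution with $(\nu_1, \nu_2)$ degrees of freedom, and that for $\nu_1 = 2$ the upper tail of the central F distribution is $(1 + 2f/\nu_2)^{-\nu_2/2}$.

```latex
\begin{align*}
P(\text{accept } H_0)
  &= P\{F[\nu_1, \nu_2] \le F[\nu_1, \nu_2; 1-\alpha] / \lambda^2\}, \\
P\{F[2, \nu_2] > f\}
  &= (1 + 2f/\nu_2)^{-\nu_2/2}, \\
F[2, 6; 0.99]
  &= 3\,(0.01^{-1/3} - 1) \approx 10.92, \\
\text{power}
  &= \Bigl(1 + \frac{10.92}{3 \times 49}\Bigr)^{-3}
   \approx (1.0743)^{-3} \approx 0.81 .
\end{align*}
```

The computed value of about 0.81 agrees with the chart reading of approximately 0.80 to within the precision of reading a graph.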


[Chart III appears here as a series of panels continued over several pages.
Vertical axis: probability of accepting the hypothesis. Horizontal axes:
$\lambda$ (for $\alpha = 0.05$) and $\lambda$ (for $\alpha = 0.01$). The plotted
curves are not recoverable from the scan.]

From A. H. Bowker and G. J. Lieberman, Engineering Statistics, 2nd ed., ©1972 by 
Prentice-Hall, Englewood Cliffs, New Jersey. Reprinted by permission. 




Chart IV. Curves of Constant Power for Determination of Sample 
Size in a One-Way Analysis of Variance (Fixed Effects Model): 
Feldt-Mahmoud Charts 


This chart shows graphs of curves of constant power for the analysis of variance
F tests in a fixed effects analysis of variance model. The graphs give the values
of n (y-scale) as a function of
$\phi' = \frac{1}{\sigma_e}\sqrt{\sum_{i=1}^{r} \alpha_i^2 / r}$ for specified
values of the number of groups r, the level of significance $\alpha$, and the
power $P = 1 - \beta$. Each graph depicts two groups of curves corresponding to
two levels of significance, $\alpha = 0.05$ and $0.01$. The graphs are given for
r = 2, 3, 4, 5; and for each value of r, the values of P used in drawing the
curves are equal to 0.5, 0.7, 0.8, 0.9, and 0.95. There are two x-scales
depending on which level of significance is employed. For a given set of values
of r, $\alpha$, P, and $\phi'$, the sample size n may be read from the ordinate
of the graph. For example, for r = 3, $\alpha = 0.05$, P = 0.7, and
$\phi' = 0.3$, the value of n read from the chart is approximately equal to 29.
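
The link between this chart and Chart II can be made explicit. The relations below are a sketch, assuming the usual convention for the Feldt-Mahmoud charts that $\phi = \phi'\sqrt{n}$, with $\nu_1 = r - 1$ and $\nu_2 = r(n - 1)$ for a one-way layout with n observations per group; these relations are not restated in the description above. For the example just given:

```latex
\begin{align*}
\phi  &= \phi' \sqrt{n} = 0.3 \sqrt{29} \approx 1.62, \\
\nu_1 &= r - 1 = 2, \qquad \nu_2 = r(n - 1) = 3 \times 28 = 84 .
\end{align*}
```

Under these assumptions, entering Chart II at $\alpha = 0.05$ with these values should return a power near the specified 0.7.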


[Chart IV appears here, continued over two pages. The curves of constant power
are not recoverable from the scan.]


From L. S. Feldt and M. W. Mahmoud, “Power Function Charts for Specifying Number 
of Observations in Analysis of Variance of Fixed Effects,” Annals of Mathematical
Statistics, 29 (1958), 871-877. Reprinted by permission. 


References 


Abraham, J. K. (1960). Note 154: On an alternative method of computing Tukey’s 
statistic for the Latin square model. Biometrics, 16, 686-691. 

Afifi, A. A. and Elashoff, R. M. (1966). Missing observations in multivariate statistics, 
I. Review of the literature. J. Amer. Stat. Assoc., 61, 595-604. 

Afifi, A. A. and Elashoff, R. M. (1967). Missing observations in multivariate statistics, 
II. Point estimation in simple linear regression. J. Amer. Stat. Assoc., 62, 10-29. 

Akutowicz, F. and Traux, H. M. (1956). Establishing control of tire cord testing
laboratories. Indus. Qual. Contr., 13 (2), 4-5.

Alexander, R. A. and Govern, D. M. (1994). A new and simple approximation for 
ANOVA under variance heterogeneity. J. Educ. Stat., 19, 91-101. 

Algina, J., Blair, R. C., and Coombs, W. T. (1995). A maximum test for scale: Type I 
error rates and power. J. Educ. Beh. Stat., 20, 27-39. 

Allen, R. E. and Wishart, J. (1930). A method of estimating the yield of a missing plot 
in field experimental work. J. Agri. Sci., 20, 399-406.

Anderson, R. L. (1946). Missing-plot techniques. Biom. Bull., 2, 41-47. 

Anderson, R. L. (1954). Components of variance and mixed models. In: Quality Con- 
trol Convention Papers: Proceedings Eighth Annual Convention, pp. 633-645. 
American Society for Quality Control, Milwaukee, Wisconsin. 

Anderson, R. L. (1960). Use of variance component analysis in the interpretation of 
biological experiments. Bull. Inter. Stat. Inst., 37, 1-22. 

Anderson, R. L. and Bancroft, T. A. (1952). Statistical Theory in Research.
McGraw-Hill, New York.

Anderson, R. L. and Houseman, E. E. (1942). Tables of orthogonal polynomial values 
extended to N = 104. Res. Bull., 297, Ames, Iowa.

Anderson, T. W. (1984). An Introduction to Multivariate Statistical Analysis, 2nd ed. 
John Wiley, New York. (1st ed., 1958.) 

Anderson, T. W. (1985). Components of variance in MANOVA. In: Multivariate Ana- 
lysis IV, pp. 1-8 (Ed. P. R. Krishnaiah). North-Holland, Amsterdam. 

Anderson, V. L. and McLean, R. A. (1974). Design of Experiments: A Realistic Ap- 
proach. Marcel Dekker, New York. 

Andrews, D. M. and Herzberg, A. (1985). Data: A Collection of Problems from Many 
Fields for the Students and Research Workers. Springer-Verlag, New York. 

Anionwu, E., Watford, D., Brozovic, M., and Kirkwood, B. (1981). Sickle cell disease 
in a British urban community. Brit. Med. J., 282, 283-286. 

Anscombe, F. J. (1948). The transformation of Poisson, binomial and negative binomial 
data. Biometrika, 35, 246-254. 

Armitage, J. V. and Krishnaiah, P. R. (1964). Tables for the Studentized Largest 
Chi-square Distribution and their Applications. Tech. Rep. No. ARL 64-188, 
Aerospace Research Laboratories, Wright-Patterson Air Force Base, Dayton, 
Ohio. 




Arteaga, C., Jeyaratnanam, S., and Graybill, F. A. (1982). Confidence intervals for 
proportions of total variance in the two-way cross component of variance model. 
Commun. Stat., A: Theo. & Meth., 11, 1643-1658. 

Arvesen, J. N. and Layard, M. W. J. (1975). Asymptotically robust tests in unbalanced 
variance component models. Ann. Stat., 3, 1122-1134. 

Arvesen, J. N. and Schmitz, T. H. (1970). Robust procedures for variance component 
problems using the jackknife. Biometrics, 26, 677-686. 

Aster, R. (1994). SAS Foundations: From Installation to Operation. McGraw-Hill, 
New York. 

Atkinson, A. C. (1982). Developments in the design of experiments. Inter. Stat. Rev.,
50, 161-177. 

Bagui, S. C. (1993). CRC Handbook of Percentiles of Noncentral t-Distribution. CRC 
Press, Boca Raton, Florida. 

Bailey, B. J. R. (1977). Tables of the Bonferroni t statistic. J. Amer. Stat. Assoc., 72,
469-478.

Bancroft, T. A. (1968). Topics in Intermediate Statistical Methods, Vol. 1. Iowa State
University Press, Ames, Iowa.

Bankier, J. D. (1960a). Operators and the r-way crossed classification. Amer. Math.
Month., 67, 841-846. 

Bankier, J. D. (1960b). An operational approach to the r-way crossed classification. 
Ann. Math. Stat., 31, 16-22. 

Barnett, V. D. (1962). Large sample tables of percentage points for Hartley’s correction 
to Bartlett’s criterion for testing the homogeneity of a set of variances. Biometrika, 
49, 487-494. 

Barnett, V. D. and Lewis, T. (1994). Outliers in Statistical Data, 3rd ed. John Wiley, 
New York. 

Bartlett, M. S. (1936). The square-root transformation in the analysis of variance. J. R. 
Stat. Soc., Suppl., 3, 68-78. 

Bartlett, M. S. (1937a). Properties of sufficiency and statistical tests. Proc. R. Soc. 
London, Ser. A, 160, 268-282.

Bartlett, M. S. (1937b). Some examples of statistical methods of research in agriculture
and applied biology. J. R. Stat. Soc., Suppl., 4, 137-183.

Bartlett, M. S. (1947). The use of transformations. Biometrics, 3, 39-52.

Bartlett, M. S. and Kendall, D. G. (1946). The statistical analysis of variance:
Heterogeneity and logarithmic transformation. J. R. Stat. Soc., Ser. B, 8, 128-150.

Beall, G. (1942). The transformation of data from entomological field experiments so 
that the analysis of variance becomes applicable. Biometrika, 29, 243-262. 

Bechhoefer, R. E. and Dunnett, C. W. (1981). Multiple Comparisons for Orthogo- 
nal Contrasts: Examples and Tables. Tech. Rep. No. 495, School of Operations 
Research and Industrial Engineering, Cornell University, Ithaca, New York. 

Bechhoefer, R. E. and Dunnett, C. W. (1982). Comparisons for orthogonal contrasts: 
Examples and tables. Technom., 24, 213-222. 

Beckman, R. J. and Cook, R. D. (1983). Outlier.......... s. Technom., 25, 119-149. 

Bennett, C. A. and Franklin, N. L. (1954). Statistical Analysis in Chemistry and the 
Chemical Industry. John Wiley, New York. 

Berry, D. A. (1987). Logarithmic transformations in ANOVA. Biometrics, 43, 439-456. 

Birch, N. J., Burdick, R. K., and Ting, N. (1990). Confidence intervals and bounds 
for a ratio of summed expected mean squares. Technom., 32, 437-444. 


Bishop, D. J. and Nair, U. S. (1939). A note on certain methods of testing for the 
homogeneity of a set of estimated variances. J. R. Stat. Soc., Suppl., 6, 89-99. 

Bishop, T. A. and Dudewicz, E. J. (1978). Exact analysis of variance with unequal 
variances: Test procedures and tables. Technom., 20, 419-424. 

Blackwell, T., Brown, C., and Mosteller, F. (1991). Which denominator? In: Fun- 
damentals of Exploratory Analysis of Variance, pp. 252-294 (Eds. D. C. Hoaglin, 
F. Mosteller, and J. W. Tukey). John Wiley, New York. 

Blischke, W. R. (1966). Variances of estimates of variance components in a three-way 
classification. Biometrics, 22, 553-565. 

Blischke, W. R. (1968). Variances of moment estimators of variance components in the 
unbalanced r-way classification. Biometrics, 24, 527-540. 

Bliss, C. (1967). Statistics in Biology, Vol. 1. McGraw-Hill, New York. 

Boardman, T. J. (1974). Confidence intervals for variance components-A comparative 
Monte Carlo study. Biometrics, 30, 251-262. 

Bock, R. D. (1963). Programming univariate and multivariate analysis of variances. 
Technom., 5, 95-117. 

Boik, R. J. (1993). Testing additivity in two-way classifications with no replications: 
The locally best invariant test. J. Appl. Stat., 20, 41-55. 

Boneau, C. A. (1960). The effects of violation of assumptions underlying the t-test. 
Psychol. Bull., 57, 49-64. 

Boneau, C. A. (1962). A comparison of the power of the U and t tests. Psychol. Rev., 
59, 246-256. 

Bose, R. C. and Shrikhande, S. S. (1959). On the falsity of Euler’s conjecture about 
the nonexistence of two orthogonal Latin squares of order 4t + 2. Proc. Nat. Acad. 
Sci., 45, 734-737. 

Bowker, A. H. and Lieberman, G. J. (1972). Engineering Statistics, 2nd ed. Prentice 
Hall, Englewood Cliffs, New Jersey. 

Bowman, K. O. (1972). Tables of the sample size requirements. Biometrika, 59, 234. 

Bowman, K. O. and Kastenbaum, M. (1975). Sample size determination in single and 
double classification experiments. In: Selected Tables in Mathematical Statistics, 
Vol. 3, pp. 1-23 (Eds. H. L. Harter and D. B. Owen). American Mathematical 
Society, Providence, Rhode Island. 

Box, G. E. P. (1953). Nonnormality and tests on variances. Biometrika, 40, 318-335. 

Box, G. E. P. (1954a). Some theorems on quadratic forms applied in the study of 
analysis of variance problems, I. Effect of inequality of variance in the one-way 
classification. Ann. Math. Stat., 25, 290-302. 

Box, G. E. P. (1954b). Some theorems on quadratic forms applied in the study of analysis 
of variance problems, II. Effect of inequality of variance and of correlation errors 
in the two-way classification. Ann. Math. Stat., 25, 484-498. 

Box, G. E. P. and Anderson, S. L. (1955). Permutation theory in the derivation of 
robust criteria and the study of departures from assumption. J. R. Stat. Soc., Ser. B, 
17, 1-26. 

Box, G. E. P. and Cox, D. R. (1964). An analysis of transformations. J. R. Stat. Soc., 
Ser. B, 26, 211-243. 

Box, G. E. P. and Cox, D. R. (1982). An analysis of transformations revisited, rebutted. 
J. Amer. Stat. Assoc., 77, 209-210. 

Box, G. E. P. and Draper, N. (1987). Empirical Model Building and Response Surfaces. 
John Wiley, New York. 


Box, G. E. P., Hunter, W. G., and Hunter, J. S. (1978). Statistics for Experimenters. 
John Wiley, New York. 

Box, G. E. P. and Tiao, G. C. (1973). Bayesian Inference in Statistical Analysis. 
Addison-Wesley, Reading, Massachusetts. (Wiley Classic Edition, 1992.) 

Bozivich, H., Bancroft, T. A., and Hartley, H. O. (1956). Power of analysis of variance 
test procedures for certain incompletely specified models. Ann. Math. Stat., 27, 
1017-1043. 

Bradley, J. V. (1964). Studies in Research Methodology, VI. The Central Limit Effect 
for a Variety of Populations and the Robustness of Z, t, and F. Tech. Rep. No. 
AMRL-54-123, Aerospace Medical Research Laboratories, Wright-Patterson Air 
Force Base, Dayton, Ohio. 

Bratcher, T. L., Moran, M. A., and Zimmer, W. J. (1970). Tables of sample sizes in 
the analysis of variance. J. Qual. Tech., 2, 156-164. 

Broemeling, L. D. (1985). Bayesian Analysis of Linear Models. Marcel Dekker, 
New York. 

Brown, M. B. and Forsythe, A. B. (1974a). The small size sample behavior of some 
statistics which test the equality of several means. Technom., 16, 129-132. 

Brown, M. B. and Forsythe, A. B. (1974b). The ANOVA and multiple comparisons 
for data with heterogeneous variances. Biometrics, 30, 719-724. 

Brown, M. B. and Forsythe, A. B. (1974c). Robust tests for the equality of variances. 
J. Amer. Stat. Assoc., 69, 364-367. 

Brown, R. A. (1974). Robustness of the Studentized range statistic. Biometrika, 61, 
171-175. 

Brownlee, K. A. (1953). Industrial Experimentation. Chemical Publishing Co., 
New York. 

Brownlee, K. A. (1965). Statistical Theory and Methodology in Science and Engineer- 
ing, 2nd ed. John Wiley, New York. (1st ed. 1960.) 

Budescu, D. V. and Applebaum, M. I. (1981). Variance stabilizing transformations 
and the power of the F test. J. Educ. Stat., 6, 55-74. 

Bulgren, W. G. (1974). Probability integral of the doubly noncentral t distribution with 
degrees of freedom n and noncentrality parameters δ and λ. In: Selected Tables 
in Mathematical Statistics, Vol. 2, pp. 1-138 (Eds. H. L. Harter and D. B. Owen). 
American Mathematical Society, Providence, Rhode Island. 

Bulmer, M. G. (1957). Approximate confidence limits for components of variance. 
Biometrika, 44, 159-167. 

Burch, L. and King, S. J. (1994). SAS Software Roadmaps: Your Guide to Discovering 
the SAS System. SAS Institute, Cary, North Carolina. 

Burdick, R. K. (1994). Using confidence intervals to test variance components. J. Qual. 
Tech., 26, 30-38. 

Burdick, R. K., Birch, N. J., and Graybill, F. A. (1986a). Confidence intervals on 
measures of variability in an unbalanced two-fold nested design with equal sub- 
sampling. J. Stat. Comp. Simul., 25, 259-272. 

Burdick, R. K. and Eickman, J. (1986). Confidence intervals on the among group 
variance component in the unbalanced one-fold nested design. J. Stat. Comp. Simul., 
26, 205-219. 

Burdick, R. K. and Graybill, F. A. (1984). Confidence intervals on linear combinations 
of variance components in the unbalanced one-way classification. Technom., 26, 
131-136. 

Burdick, R. K. and Graybill, F. A. (1985). Confidence intervals on the total variance 
in an unbalanced two-fold nested classification with equal sub-sampling. Commun. 
Stat., A: Theo. & Meth., 14, 761-774. 

Burdick, R. K. and Graybill, F. A. (1988). The present status of confidence interval 
estimation on variance components in balanced and unbalanced random models. 
Commun. Stat., A: Theo. & Meth., 17, 1165-1195. 

Burdick, R. K. and Graybill, F. A. (1992). Confidence Intervals on Variance Compo- 
nents. Marcel Dekker, New York. 

Burdick, R. K., Maqsood, F., and Graybill, F. A. (1986b). Confidence intervals on 
the intraclass correlation in the unbalanced one-way classification. Commun. Stat., 
A: Theo. & Meth., 15, 3353-3378. 

Cameron, J. M. (1951). Use of components of variance in preparing schedules for the 
sampling of baled wool. Biometrics, 7, 83-96. 

Chambers, J. M., Cleveland, W. S., Kleiner, B., and Tukey, P. A. (1983). Graphical 
Methods for Data Analysis. Wadsworth, Pacific Grove, California. 

Chao, M.-T. and Glaser, R. E. (1978). The exact distribution of Bartlett’s test statistic 
for homogeneity of variances with unequal sample sizes. J. Amer. Stat. Assoc., 73, 
422-426. 

Chew, V. (1976a). Comparisons Among Treatment Means in Analysis of Variance. Tech. 
Bull., Agricultural Research Service, U.S. Department of Agriculture, Washington, 
D.C. 

Chew, V. (1976b). Uses and abuses of Duncan’s multiple range test. Hort. Sci., 11, 
251-253. 

Chew, V. (1976c). Comparing treatment means: A compendium. Hort. Sci., 11, 348- 
357. 

Christensen, R. (1996). Plane Answers to Complex Questions: The Theory of Linear 
Models, 2nd ed. Springer-Verlag, New York. 

Cleveland, W. S. (1985). The Elements of Graphing Data. Wadsworth, Pacific Grove, 
California. 

Clinch, J. J. and Keselman, H. J. (1982). Parametric alternatives to the analysis of 
variance. J. Educ. Stat., 7, 207-214. 

Coakes, S. J. and Steed, L. G. (1997). SPSS: Analysis without Anguish. John Wiley, 
Chichester. 

Cochran, W. G. (1937). Catalogue of uniformity trial data. J. R. Stat. Soc., Suppl., 4, 
233-253. 

Cochran, W. G. (1940). The analysis of variance when experimental errors follow the 
Poisson or binomial laws. Ann. Math. Stat., 11, 335-347. 

Cochran, W. G. (1941). The distribution of the largest of a set of estimated variances 
as a fraction of their total. Ann. Eug., 11, 47-52. 

Cochran, W. G. (1951). Testing a linear relation among variances. Biometrics, 7, 17-32. 

Cochran, W. G. (1954). Some methods for strengthening the common χ² tests. 
Biometrics, 10, 417-451. 

Cochran, W. G. (1957). Analysis of covariance: Its nature and uses. Biometrics, 13, 
261-281. 

Cochran, W. G. (1964). Approximate significance levels of the Behrens-Fisher test. 
Biometrics, 20, 191-195. 

Cochran, W. G. and Cox, G. M. (1957). Experimental Designs, 2nd ed. John Wiley, 
New York. 

Cody, R. P. and Smith, J. K. (1997). Applied Statistics and the SAS Programming 
Language, 4th ed. SAS Institute, Cary, North Carolina. 


Cohen, A. and Strawderman, W. E. (1971). Unbiasedness of tests for homogeneity of 
variances. Ann. Math. Stat., 42, 355-360. 

Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences, 2nd ed. 
Lawrence Erlbaum, Hillsdale, New Jersey. 

Collyer, C. E. and Enns, J. T. (1987). Analysis of Variance: The Basic Designs. Nelson 
Hall, Chicago. 

Conover, W. J. (1971). Practical Nonparametric Statistics. John Wiley, New York. 

Conover, W. J., Johnson, M. E., and Johnson, M. M. (1981). A comparative study 
of tests for homogeneity of variances with applications to outer continental shelf 
bidding data. Technom., 23, 351-361. (Corrigendum ibid., 26, 302.) 

Cooley, W. W. and Lohnes, P. R. (1962). Multivariate Procedures for the Behavioral 
Sciences. John Wiley, New York. 

Coombs, W. T., Algina, J., and Oltman, D. O. (1996). Univariate and multivariate 
omnibus hypothesis tests selected to control type I error rates when population 
variances are not necessarily equal. Rev. Educ. Res., 66, 137-179. 

Cornfield, J. and Tukey, J. W. (1956). Average values of mean squares in factorials. 
Ann. Math. Stat., 27, 907-949. 

Cox, D. R. (1958a). Planning of Experiments. John Wiley, New York. (Wiley Classic 
Edition, 1992.) 

Cox, D. R. (1958b). The interpretation of the effects of non-additivity in the Latin square. 
Biometrika, 45, 69-73. 

Cox, D. R. (1977). Nonlinear models, residuals and transformations. Math. Operations- 
forsch. Stat., Ser. Stat., 8, 3-22. 

Cox, D. R. (1984). Interaction. Inter. Stat. Rev., 52, 1-31. 

Crisler, L. (1991). Computer Based Data Analysis: Using SPSS-X in the Social and 
Behavioral Sciences. Nelson-Hall, Chicago. 

Crowder, M. J. and Hand, D. J. (1990). Analysis of Repeated Measures. Chapman & 
Hall, London. 

Crump, S. L. (1946). The estimation of variance components in analysis of variance. 
Biom. Bull., 2, 7-11. 

Crump, S. L. (1951). The present status of variance component analysis. Biometrics, 
7, 1-16. 

Cummings, W. B. and Gaylor, D. W. (1974). Variance component testing in unbalanced 
nested designs. J. Amer. Stat. Assoc., 69, 765-771. 

Curtiss, J. H. (1943). On transformations used in the analysis of variance. Ann. Math. 
Stat., 14, 107-122. 

D’Agostino, R. B. (1971). An omnibus test for normality for moderate and large samples. 
Biometrika, 58, 341-348. 

D’Agostino, R. B. (1972). Small sample probability points for the D test of normality. 
Biometrika, 59, 219-221. 

Damon, R. A., Jr. and Harvey, W. R. (1987). Experimental Design, ANOVA, and 
Regression. Harper & Row, New York. 

Daniel, W. W. (1990). Applied Nonparametric Statistics, 2nd ed. Brooks/Cole, Belmont, 
California. (1st ed., 1978.) 

Daniels, H. E. (1939). The estimation of components of variance. J. R. Stat. Soc., Suppl., 
6, 186-197. 

Das, M. N. and Giri, N. C. (1979). Design and Analysis of Experiments. John Wiley 
(Eastern), New Delhi. 

Davenport, J. M. (1975). Two methods of estimating the degrees of freedom of an 
approximate F’. Biometrika, 62, 682-684. 


Davenport, J. M. and Webster, J. T. (1973). A comparison of some approximate 
F-tests. Technom., 15, 779-789. 

David, H. A. (1952). Upper 5 and 1% points of the maximum F-ratio. Biometrika, 39, 
422-424. 

Davies, O. L. (Ed.) (1954, 1956, 1960). The Design and Analysis of Industrial 
Experiments, 1st, 2nd, & 3rd eds. Oliver and Boyd, Edinburgh. 

Davies, O. L. and Goldsmith, P. L. (Eds.) (1972). Statistical Methods in Research and 
Production, 4th ed. Oliver and Boyd, Edinburgh. 

Day, S. J. and Graham, D. F. (1991). Sample size estimation for comparing two or 
more treatment groups in clinical trials. Stat. Med., 10, 33-43. 

DeLury, D. B. (1946). The analysis of Latin squares when some observations are miss- 
ing. J. Amer. Stat. Assoc., 41, 370-389. 

Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977). Maximum likelihood from 
incomplete data via the EM algorithm (with discussion). J. R. Stat. Soc., Ser. B, 
39, 1-38. 

Dénes, J. and Keedwell, A. D. (1974). Latin Squares and Their Applications. Akadémia 
Kiad. English Universities Press, Budapest/Academic Press, London. 

Desmond, D. J. (1954). Quality control on the setting of voltage regulators. Appl. Stat., 
3, 9-15. 

DiIorio, F. C. (1991). SAS Applications Programming: A Gentle Introduction. 
PWS-KENT, Boston. 

DiIorio, F. C. and Hardy, K. A. (1996). Quick Start to Data Analysis with SAS. Duxbury 
Press, Belmont, California. 

Dijkstra, J. B. and Werter, S. P. J. (1981). Testing the equality of several means 
when the population variances are unequal. Commun. Stat., B: Simul. & Comp., 
10, 557-569. 

Dixon, W. J. (Ed.) (1992). BMDP Statistical Software Manual, Vols. 1, 2, & 3. University 
of California Press, Los Angeles. 

Dobson, A. J. (1990). An Introduction to Generalized Linear Models. Chapman & Hall, 
London. 

Dodge, Y. (1985). Analysis of Experiments with Missing Data. John Wiley, New York. 

Dodge, Y. and Shah, K. R. (1977). Estimation of parameters in Latin squares and 
Graeco-Latin squares with missing observations. Commun. Stat., A: Theo. & Meth., 
6, 1465-1472. 

Donaldson, T. S. (1968). Robustness of the F-test to errors of both kinds and the 
correlation between the numerator and denominator of the F-ratio. J. Amer. Stat. 
Assoc., 63, 660-676. 

Donner, A. (1986). A review of inference procedures for the intraclass correlation 
coefficient in the one-way random effects model. Inter. Stat. Rev., 54, 67-82. 

Donner, A. and Koval, J. J. (1989). The effect of imbalance on significance testing 
in one-way model II analysis of variance. Commun. Stat., A: Theo. & Meth., 18, 
1239-1250. 

Donner, A. and Wells, G. (1986). A comparison of confidence interval methods for the 
intraclass correlation coefficient. Biometrics, 42, 401-412. 

Donoghue, J. R. and Collins, L. M. (1990). A note on the unbiased estimation of the 
intraclass correlation. Psychom., 55, 159-164. 

Draper, N. R. and Hunter, W. G. (1969). Transformations: Some examples revisited. 
Technom., 11, 23-40. 

Draper, N. R. and Smith, H. (1981). Applied Regression Analysis, 2nd ed. John Wiley, 
New York. (3rd ed., 1998.) 


Duncan, A. J. (1957). Charts of the 10% and 50% points of the operating characteristic 
curves for fixed effects analysis of variance F tests, α = .01 and .05. J. Amer. Stat. 
Assoc., 52, 345-349. 

Duncan, D. B. (1952). On the properties of multiple comparison test. Virg. J. Sci., 
3, 49-67. 

Duncan, D. B. (1955). Multiple range and multiple F tests. Biometrics, 11, 1-42. 

Dunlop, G. (1933). Methods of experimentation in animal nutrition. J. Agri. Sci., 23, 
580-614. 

Dunn, O. J. (1958). Estimation of the means of dependent variables. Ann. Math. Stat., 
29, 1095-1111. 

Dunn, O. J. (1959). Confidence intervals for the means of dependent, normally dis- 
tributed variables. J. Amer. Stat. Assoc., 54, 613-621. 

Dunn, O. J. (1961). Multiple comparisons among means. J. Amer. Stat. Assoc., 56, 
52-64. 

Dunn, O. J. and Clark, V. A. (1974, 1987). Applied Statistics: Analysis of Variance 
and Regression, 1st & 2nd eds. John Wiley, New York. 

Dunn, O. J. and Massey, F. J. (1965). Estimation of multiple contrasts using t 
distribution. J. Amer. Stat. Assoc., 60, 573-583. 

Dunnett, C. W. (1955). A multiple comparison procedure for comparing several treat- 
ments with a control. J. Amer. Stat. Assoc., 50, 1096-1121. 

Dunnett, C. W. (1980a). Pairwise multiple comparisons in the homogeneous variance, 
unequal sample size case. J. Amer. Stat. Assoc., 75, 789-795. 

Dunnett, C. W. (1980b). Pairwise multiple comparisons in the unequal variance case. 
J. Amer. Stat. Assoc., 75, 796-800. 

Dunnett, C. W. (1982). Robust multiple comparisons. Commun. Stat., A: Theo. & Meth., 
11, 2611-2629. 

Dyer, D. D. and Keating, J. P. (1980). On the determination of critical values for 
Bartlett’s test. J. Amer. Stat. Assoc., 75, 313-319. 

Efron, B. (1982). Transformation theory: How normal is a family of distributions? Ann. 
Stat., 10, 323-339. 

Eisen, E. J. (1966). The quasi-F test for an unnested fixed factor in an unbalanced 
hierarchical design with a mixed model. Biometrics, 22, 937-942. 

Eisenhart, C. (1947a). The assumptions underlying the analysis of variance. Biometrics, 
3, 1-21. 

Eisenhart, C. (1947b). Inverse sine transformation of proportion. In: Selected Tech- 
niques of Statistical Analysis, Chapter 16, pp. 395-416 (Eds. C. Eisenhart, M. W. 
Hastay, and W. A. Wallis). McGraw-Hill, New York. 

Eisenhart, C. and Solomon, H. (1947). Significance of the largest of a set of sample 
estimates of variance. In: Selected Techniques of Statistical Analysis, Chapter 15, 
pp. 383-394 (Eds. C. Eisenhart, M. W. Hastay, and W. A. Wallis). McGraw-Hill, 
New York. 

Eisenhart, C., Hastay, M. W., and Wallis, W. A. (Eds.) (1947). Selected Techniques 
of Statistical Analysis. McGraw-Hill, New York. 

Elliott, R. J. (1995). Learning SAS in the Computer Lab. Duxbury Press, Belmont, 
California. 

Euler, L. (1782). Recherches sur une nouvelle espéce de quarrés magiques. Verh. Zeeu. 
Genoot. Wetens. Vlissen., 9, 85-239. 

Everitt, B. and Derr, G. (1996). A Handbook of Statistical Analyses Using SAS. Chap- 
man & Hall, London. 


Federer, W. T. (1955). Experimental Design: Theory and Applications. MacMillan, 
New York. 

Federer, W. T. (1980). Some recent results in experimental design with a bibliography, 
I. Inter. Stat. Rev., 48, 337-368. 

Federer, W. T. (1981a). Some recent results in experimental design with a bibliography, 
II: A-K. Inter. Stat. Rev., 49, 95-109. 

Federer, W. T. (1981b). Some recent results in experimental design with a bibliography, 
III: L-Z. Inter. Stat. Rev., 49, 185-197. 

Federer, W. T. and Balaam, L. N. (1972). Bibliography on Experiment and Treatment 
Design: Pre-1968. Oliver and Boyd, Edinburgh (for the International Statistical 
Institute). 

Federer, W. T. and Federer, A. J. (1973). A study of design publications: 1968 through 
1971. Amer. Stat., 27, 160-163. 

Federer, W. T. and Zelen, M. (1966). Analysis of multifactor classifications with 
unequal number of observations. Biometrics, 22, 525-552. 

Feldt, L. S. and Mahmoud, M. W. (1958a). Power function charts for specification of 
sample size in analysis of variance. Psychom., 23, 201-210. 

Feldt, L. S. and Mahmoud, M. W. (1958b). Power function charts for specifying 
numbers of observations in analysis of variance of fixed effects. Ann. Math. Stat., 
29, 871-877. 

Fisher, L. and McDonald, J. (1978). Fixed Effects Analysis of Variance. Academic 
Press, New York. 

Fisher, R. A. (1918). The correlation between relatives on the supposition of Mendelian 
law on inheritance. Trans. R. Soc., Edin., 52, 399-433. 

Fisher, R. A. (1924). On a distribution yielding the error functions of several well-known 
statistics. Proc. Inter. Math. Cong. (Toronto), pp. 805-813. 

Fisher, R. A. (1925). Statistical Methods for Research Workers, 1st ed. Oliver and Boyd, 
Edinburgh and London. 

Fisher, R. A. (1926). The arrangements of field experiments. J. Ministry Agri., 33, 
503-513. 

Fisher, R. A. (1932). Statistical Methods for Research Workers, 4th ed. Oliver and Boyd, 
Edinburgh. 

Fisher, R. A. (1935). The Design of Experiments, 1st ed. Oliver & Boyd, Edinburgh & 
London. (7th ed., 1960; 8th ed., 1966.) 

Fisher, R. A. (1958). Statistical Methods for Research Workers, 13th ed. Hafner, 
New York. (19th ed., 1997.) 

Fisher, R. A. and Mackenzie, W. A. (1923). Studies in crop variation, II. The manurial 
response of different potato varieties. J. Agri. Sci., 13, 311-320. 

Fisher, R. A. and Yates, F. (1963). Statistical Tables for Biological, Agriculture and 
Medical Research, 6th ed. Hafner, New York. (4th ed., 1953; 5th ed., 1957.) 

Fleiss, J. L. (1986). The Design and Analysis of Clinical Experiments. John Wiley, 
New York. 

Fox, M. (1956). Charts of the power of the F-test. Ann. Math. Stat., 27, 484-497. 

Freeman, M. F. and Tukey, J. W. (1950). Transformations related to the angular and 
the square root. Ann. Math. Stat., 21, 607-611. 

Freund, R. J. and Littell, R. C. (1991). SAS System for Regression, 2nd ed. SAS 
Institute, Cary, North Carolina. 

Friendly, M. (1991). SAS System for Statistical Graphics. SAS Institute, Cary, North 
Carolina. 


Frude, N. (1993). A Guide to SPSS-PC+. Springer-Verlag, New York. 

Gabriel, K. R. (1964). Procedure for testing the homogeneity of all sets of means in 
analysis of variance. Biometrics, 20, 459-477. 

Gallo, J. and Khuri, A. I. (1990). Exact tests for the random and fixed effects in an 
unbalanced mixed two-way cross-classification model. Biometrics, 46, 1087-1095. 

Games, P. A. (1977). An improved t table for simultaneous control on g contrasts. 
J. Amer. Stat. Assoc., 72, 531-534. 

Games, P. A. and Howell, J. F. (1976). Pairwise multiple comparison procedures with 
unequal n’s and/or variances. J. Educ. Stat., 1, 113-125. 

Games, P. A., Winkler, H. B., and Probert, D. A. (1972). Robust tests for homogeneity 
of variance. Educ. Psychol. Meas., 32, 887-909. 

Ganguli, M. (1941). A note on nested sampling. Sankhya, 5, 449-452. 

Gartside, P. S. (1972). A study of methods for comparing several variances. J. Amer. 
Stat. Assoc., 67, 342-346. 

Gates, C. E. and Shiue, C. (1962). The analysis of variance of the s-stage hierarchical 
classification. Biometrics, 18, 529-536. 

Gayen, A. K. (1950). The distribution of the variance ratio in random samples of any 
size drawn from non-normal universes. Biometrika, 37, 236-255. 

Gaylor, D. W. and Hartwell, T. D. (1969). Expected mean squares for nested classifi- 
cations. Biometrics, 25, 427-430. 

Gaylor, D. W. and Hopper, F. N. (1969). Estimating the degrees of freedom for linear 
combinations of mean squares by Satterthwaite’s formula. Technom., 11, 691-706. 

Geary, R. C. (1935). The ratio of the mean deviation to the standard deviation as a (new) 
test of normality. Biometrika, 27, 310-332. 

Geary, R. C. (1936). Moments of the ratio of the mean deviation to the standard deviation 
for normal samples. Biometrika, 28, 298-305. 

Geary, R. C. (1947). Testing for normality. Biometrika, 34, 209-242. 

Ghosh, M. N. and Sharma, D. (1963). Power of Tukey’s test for non-additivity. J. R. 
Stat. Soc., Ser. B, 25, 213-219. 

Gibbons, J. D. and Pratt, J. W. (1975). P-values: Interpretations and methodology. 
Amer. Stat., 29, 20-25. 

Gill, J. L. (1978). Design and Analysis of Experiments in the Animal and Medical 
Sciences, Vols. 1, 2, & 3. Iowa State University Press, Ames, Iowa. 

Glaser, R. E. (1976). Exact critical values for Bartlett’s test for homogeneity of vari- 
ances. J. Amer. Stat. Assoc., 71, 488-490. 

Glass, G. V., Peckham, P. D., and Sanders, J. R. (1972). Consequences of failure to 
meet assumptions underlying the fixed effects analysis of variance and covariance. 
Rev. Educ. Res., 42, 239-288. 

Glen, W. A. and Kramer, C. Y. (1958). Analysis of variance of a randomized block 
design with missing observations. Appl. Stat., 7, 173-185. 

Gosslee, D. G. and Lucas, H. L. (1965). Analysis of variance of disproportionate data 
when interaction is present. Biometrics, 21, 115-133. 

Gower, J. C. (1962). Variance component estimation for unbalanced hierarchical 
classifications. Biometrics, 18, 537-542. 

Graybill, F. A. (1954). On quadratic estimation of variance components. Ann. Math. 
Stat., 25, 367-372. 

Graybill, F. A. (1961). An Introduction to Linear Statistical Models, Vol. 1. McGraw- 
Hill, New York. 

Graybill, F. A. (1976). Theory and Applications of Linear Models. Duxbury Press, 
North Scituate, Massachusetts. 


Graybill, F. A. and Hultquist, R. A. (1961). Theorems concerning Eisenhart’s Model II. 
Ann. Math. Stat., 32, 261-269. 

Graybill, F. A. and Wortham, A. W. (1956). A note on uniformly best unbiased 
estimators for variance components. J. Amer. Stat. Assoc., 51, 266-268. 

Greenberg, B. G. (1951). Why randomize? Biometrics, 7, 309-322. 

Groggel, D. J., Wackerly, D. D., and Rao, P. V. (1988). Nonparametric estimation in 
one-way random effects models. Commun. Stat., B: Simul. & Comp., 17, 887-903. 

Guenther, W. C. (1964). Analysis of Variance. Prentice-Hall, Englewood Cliffs, 
New Jersey. 

Hahn, G. J. (1982). Design of experiments: An annotated bibliography. In: Encyclo- 
pedia of Statistical Sciences, Vol. 2, pp. 359-366 (Eds. S. Kotz and N. L. Johnson). 
John Wiley, New York. 

Hahn, G. J. and Hendrickson, R. W. (1971). A table of percentage points of the 
distribution of the largest absolute value of k Student t variate and its applications. 
Biometrika, 58, 323-332. 

Hald, A. (1952). Statistical Tables and Formulas. John Wiley, New York. 

Hald, A. and Sinkbaek, S. A. (1950). A table of percentage points of the chi-square 
distribution. Skand. Aktuar., 33, 168-175. 

Hall, I. J. (1972). Some comparisons of tests for equality of variances. J. Stat. Comp. 
Simul., 1, 183-194. 

Halvorsen, K. T. (1991). Value splitting involving more factors. In: Fundamentals of 
Exploratory Analysis of Variance, pp. 114-145 (Eds. D. C. Hoaglin, F. Mosteller, 
and J. W. Tukey). John Wiley, New York. 

Hampel, F. R., Ronchetti, E. M., Rousseeuw, P. J., and Stahel, W. A. (1986). Robust 
Statistics: The Approach Based on Influence Functions. John Wiley, New York. 

Hand, D. J., Daly, F., McConway, K., Lunn, D., and Ostrowski, E. (1993). A Hand- 
book of Small Data Sets. Chapman & Hall, London. 

Harsaae, E. (1969). On the computation and use of a table of percentage points of 
Bartlett’s M. Biometrika, 56, 273-281. 

Harter, H. L. (1960). Tables of range and Studentized range. Ann. Math. Stat., 31, 
1122-1147. 

Harter, H. L. (1961). Expected values of normal order statistics. Biometrika, 48, 151- 
165. 

Harter, H. L. (1964a). A new table of percentage points of the χ² distribution. 
Biometrika, 51, 231-239. 

Harter, H. L. (1964b). New Tables of the Incomplete Gamma Function Ratio and 
of Percentage Points of the Chi-square and Beta Distributions. U.S. Government 
Printing Office, Washington, D.C. 

Harter, H. L. (1969a). Order Statistics and Their Use in Testing and Estimation, Vol. 1. 
Tests Based on Range and Studentized Range of Samples from a Normal Population. 
U.S. Government Printing Office, Washington, D.C. 

Harter, H. L. (1969b). Order Statistics and Their Use in Testing and Estimation, Vol. 2. 
Estimates Based on Order Statistics of Samples from Various Populations. U.S. 
Government Printing Office, Washington, D.C. 

Hartley, H. O. (1940). Testing the homogeneity of a set of variances. Biometrika, 31, 
249-255. 

Hartley, H. O. (1950). The maximum F-ratio as a short-cut test for heterogeneity of 
variance. Biometrika, 37, 308-312. 

Hartley, H. O. (1956). A plan for programming analysis of variance for general purpose 
computers. Biometrics, 12, 110-122. 


Hartley, H. O. (1962). Analysis of variance. In: Mathematical Methods for Digital 
Computers, Vol. 1, pp. 221-230 (Eds. A. Ralston and H. S. Wilf). John Wiley, 
New York. 

Hartung, J. and Voet, B. (1987). An asymptotic χ²-test for variance components. 
In: Contributions to Stochastics, pp. 153-163 (Ed. W. Sendler). Physica-Verlag, 
Heidelberg. 

Harville, D. A. (1969). Variance Components Estimation for the Unbalanced One- 
Way Random Classification—A Critique. Tech. Rep. No. ARL-69-0180, Aerospace 
Research Laboratories, Wright-Patterson Air Force Base, Dayton, Ohio. 

Harville, D. A. (1977). Maximum-likelihood approaches to variance component 
estimation and to related problems. J. Amer. Stat. Assoc., 72, 320-340. 

Harville, D. A. (1978). Alternative formulations and procedures for the two-way mixed 
model. Biometrics, 34, 441-454. 

Hatcher, L. and Stepanski, E. J. (1994). A Step-by-Step Approach to Using the 
SAS System for Univariate and Multivariate Statistics. SAS Institute, Cary, North 
Carolina. 

Hawkins, D. M. (1980). Identification of Outliers. Chapman & Hall, London. 

Hayman, G. E., Govindarajulu, Z., and Leone, F. C. (1973). Tables of the cumulative 
noncentral chi-square distribution. In: Selected Tables in Mathematical Statistics, 
Vol. 1, pp. 1-78 (Eds. H. L. Harter and D. B. Owen). American Mathematical 
Society, Providence, Rhode Island. 

Hayter, A. J. (1984). A proof of the conjecture that the Tukey-Kramer multiple com- 
parisons procedure is conservative. Ann. Stat., 12, 61-75. 

Healy, M. J. R. and Westmacott, M. (1956). Missing values in experiments analyzed 
on automatic computers. Appl. Stat., 5, 203-206. 

Hedayat, A. and Afsarinejad, K. (1975). Repeated measurements designs, I. In: A 
Survey of Statistical Design and Linear Models, pp. 229-242 (Ed. J. N. Srivastava). 
North-Holland, Amsterdam. 

Hedayat, A. and Afsarinejad, K. (1978). Repeated measurements design, II. Ann. 
Stat., 6, 619-628. 

Hedderson, J. (1991). SPSS-PC Plus Made Simple. Wadsworth, Belmont, California. 

Hedderson, J. and Fisher, M. (1993). SPSS-X Made Simple, 2nd ed. Wadsworth, 
Belmont, California. 

Hegemann, V. and Johnson, D. E. (1976). The power of two tests for additivity. J. 
Amer. Stat. Assoc., 71, 945-948. 

Heiberger, R. M. (1989). Computation for the Analysis of Designed Experiment. John 
Wiley, New York. 

Hemmerle, W. J. (1964). Algebraic specification of statistical models for analysis of 
variance computations. Assoc. Comput. Mach., 11, 234-239. 

Henderson, C. R. (1953). Estimation of variance and covariance components. Biomet- 
rics, 9, 226-252. 

Henderson, C. R. (1959). Design and analysis of animal husbandry experiments. 
In: Techniques and Procedures in Animal Production Research, Chapter 1, pp. 
2-56 (Ed. C. R. Henderson). American Society of Animal Production, Beltsville, 
Maryland. 

Henderson, C. R. (1969). Design and analysis of animal husbandry experiments. In: 
Techniques and Procedures in Animal Science Research, 2nd ed., Chapter 1, pp. 
1-35 (Ed. C. R. Henderson). American Society of Animal Science Monograph, 
Quality Corporation, Albany, New York. 


Hendy, M. F. and Charles, J. A. (1970). The production techniques, silver content 
and circulation history of the twelfth century Byzantine Trachy. Archaeometry, 12, 
13-21. 

Herbach, L. H. (1959). Properties of Model II type analysis of variance tests, A: 
Optimum nature of the F-test for Model II in balanced case. Ann. Math. Stat., 
30, 939-959. 

Hernandez, R. P. and Burdick, R. K. (1993). Confidence intervals on the total variance 
in an unbalanced two-fold nested design. Biomet. J., 35, 515-522. 

Hernandez, R. P., Burdick, R. K., and Birch, N. J. (1992). Confidence intervals 
and tests of hypotheses on variance components in an unbalanced two-fold nested 
design. Biomet. J., 34, 387-402. 

Herr, D. G. and Gaebelin, J. (1978). Nonorthogonal two-way analysis of variance. 
Psychol. Bull., 85, 207-216. 

Herzberg, A. M. and Cox, D. R. (1969). Recent work in the design of experiments: A 
bibliography and a review. J. R. Stat. Soc., Ser. A, 132, 29-67. 

Herzberg, P. A. (1994). How SAS Works: A Comprehensive Introduction. Springer- 
Verlag, New York. 

Hicks, C. R. (1956). Fundamentals of analysis of variance, Part III. Indust. Qual. Contr., 
13 (4), 13-16. 

Hicks, C. R. (1987). Fundamental Concepts in the Design of Experiments, 3rd ed. Holt, 
Rinehart and Winston, New York. 

Hinkelmann, K. and Kempthorne, O. (1994). Design and Analysis of Experiments, 
Vol. I, Introduction to Experimental Design. John Wiley, New York. 

Hinkley, D. V., Reid, N., and Snell, E. J. (Eds.) (1991). Statistical Theory and Mod- 
elling. Chapman & Hall, London. 

Hirotsu, C. (1968). An approximate test for the case of random effects model in a 
two-way layout with unequal cell frequencies. Rep. Stat. Appl. Res. (JUSE), 15, 
13-26. 

Hirotsu, C. (1973). Multiple comparisons in two-way layout. Rep. Stat. Appl. Res. 
(JUSE), 20, 1-10. 

Hirotsu, C. (1983). An approach to defining the pattern of interaction effects in a two- 
way layout. Ann. Inst. Stat. Math. (Japan), Ser. A, 35, 77-90. 

Hoaglin, D. C. (1988). Transformations in everyday experience. Chance, 1 (4), 40-45. 

Hoaglin, D. C., Mosteller, F., and Tukey, J. W. (Eds.) (1983). Understanding Robust 
and Exploratory Data Analysis. John Wiley, New York. 

Hoaglin, D. C., Mosteller, F., and Tukey, J. W. (Eds.) (1991). Fundamentals of Ex- 
ploratory Analysis of Variance. John Wiley, New York. 

Hochberg, Y. (1974). Some conservative generalizations of the T-method in 
simultaneous inference. J. Multivar. Anal., 4, 224-234. 

Hochberg, Y. (1976). A modification of the T-method of multiple comparisons for 
one-way layout with unequal variances. J. Amer. Stat. Assoc., 71, 200-203. 

Hochberg, Y. and Tamhane, A. C. (1983). Multiple comparisons in a mixed model. 
Amer. Stat., 37, 305-307. 

Hochberg, Y. and Tamhane, A. C. (1987). Multiple Comparison Procedures. John 
Wiley, New York. 

Hocking, R. R. (1973). A discussion of the two-way mixed model. Amer. Stat., 27, 
148-152. 

Hocking, R. R. (1985). The Analysis of Linear Models. Brooks/Cole, Monterey, 
California. 


Hocking, R. R. (1993). Variance component estimation in mixed linear models. In: Ap- 
plied Analysis of Variance in Behavioral Science, pp. 541-571 (Ed. L. K. Edwards). 
Marcel Dekker, New York. 

Hocking, R. R. (1996). The Analysis of Linear Models: Regression and Analysis of 
Variance. John Wiley, New York. 

Hodges, J. L., Jr. and Lehmann, E. L. (1970). Basic Concepts of Probability and 
Statistics, 2nd ed. Holden-Day, San Francisco. 

Hogg, R. V. and Craig, A. T. (1995). Introduction to Mathematical Statistics, 5th ed. 
Prentice-Hall, Englewood Cliffs, New Jersey. 

Hollander, M. and Wolfe, D. A. (1998). Nonparametric Statistical Methods, 2nd ed. 
John Wiley, New York. (1st ed. 1973.) 

Holm, S. (1979). A simple sequentially rejective multiple test procedure. Scand. J. Stat., 
6, 65-70. 

Horsnell, G. (1953). The effect of unequal group variances on the F-test for the homo- 
geneity of group means. Biometrika, 40, 128-136. 

Howell, J. F. and Games, P. A. (1973). The effects of variance heterogeneity on 
simultaneous multiple-comparison procedures with equal sample size. Brit. J. Math. 
Stat. Psychol., 27, 72-81. 

Hoyle, M. H. (1971). Spoilt data — An introduction and a bibliography. J. R. Stat. Soc., 
Ser. A, 134, 429-439. 

Hoyle, M. H. (1973). Transformations — An introduction and a bibliography. Inter. 
Stat. Rev., 41, 203-223. 

Hsu, J. (1996). Multiple Comparisons: Theory and Methods. Chapman & Hall, London. 

Huber, P. J. (1981). Robust Statistics. John Wiley, New York. 

Huck, S. W. and Layne, B. H. (1974). Checking for proportional n’s in factorial 
ANOVA’s. Educ. Psychol. Meas., 34, 281-287. 

Hudson, J. D. and Krutchkoff, R. G. (1968). A Monte Carlo investigation of the size 
and power of tests employing Satterthwaite’s synthetic mean squares. Biometrika, 
55, 431-433. 

Huitema, B. E. (1980). The Analysis of Covariance and Alternatives. John Wiley, 
New York. 

Huitson, A. (1971). The Analysis of Variance. Griffin, London. 

Hussein, M. and Milliken, G. A. (1978a). An unbalanced two-way model with random 
effects having unequal variances. Biomet. J., 20, 203-213. 

Hussein, M. and Milliken, G. A. (1978b). An unbalanced nested model with random 
effects having unequal variances. Biomet. J., 20, 329-338. 

Imhof, J. P. (1958). Contributions to the Theory of Mixed Models in the Analysis of Vari- 
ance. Ph.D. Thesis, Department of Statistics, University of California, Berkeley, 
California. 

Imhof, J. P. (1960). A mixed model for the complete three-way layout with two random 
effects factors. Ann. Math. Stat., 31, 906-928. 

Jackson, R. W. B. (1939). Reliability of mental tests. Brit. J. Psychol., 29, 267-287. 

Jaffe, J. A. (1994). Mastering the SAS System, 2nd ed. Van Nostrand Reinhold, 
New York. 

James, G. S. (1951). The comparison of several groups of observations when the ratios 
of population variances are unknown. Biometrika, 38, 324-329. 

Japanese Standards Association (1972). Selected Tables and Formulas with Computer 
Applications. Japanese Standards Association, Tokyo. 


Jeyaratnam, S. and Graybill, F. A. (1980). Confidence intervals on variance 
components in three-factor cross-classification models. Technom., 22, 375-380. 

John, J. A. and Quenouille, M. H. (1977). Experiments: Design and Analysis. Griffin, 
London. 

John, P. W. M. (1971). Statistical Design and Analysis of Experiments. Macmillan, 
New York. (Reprinted 1998, SIAM, Philadelphia.) 

Johnson, D. E. and Graybill, F. A. (1972a). The estimation of σ² in a two-way 
classification with interaction. J. Amer. Stat. Assoc., 67, 388-394. 

Johnson, D. E. and Graybill, F. A. (1972b). An analysis of a two-way model with 
interaction and no replication. J. Amer. Stat. Assoc., 67, 862-868. 

Johnson, N. L. (1948). Alternative systems in the analysis of variance. Biometrika, 35, 
80-87. 

Johnson, N. L., Kotz, S., and Balakrishnan, N. (1995). Continuous Univariate Dis- 
tributions, Vol. 2, 2nd ed. John Wiley, New York. 

Johnson, N. L. and Leone, F. C. (1964, 1977). Statistics and Experimental Design 
in Engineering and the Physical Sciences, Vol. 2. 1st & 2nd eds. John Wiley, 
New York. 

Jones, B. and Kenward, M. G. (1989). Design and Analysis of Cross-Over Trials. 
Chapman & Hall, London. 

Kafadar, K. and Tukey, J. W. (1988). A bidec t table. J. Amer. Stat. Assoc., 83, 532-539. 

Kastenbaum, M. A., Hoel, D. G., and Bowman, K. O. (1970a). Sample size require- 
ments: One-way analysis of variance. Biometrika, 57, 421-430. 

Kastenbaum, M. A., Hoel, D. G., and Bowman, K. O. (1970b). Sample size require- 
ments: Randomized block designs. Biometrika, 57, 573-577. 

Kempthorne, O. (1952). The Design and Analysis of Experiments. John Wiley, 
New York. 

Kempthorne, O. (1955). The randomization theory of experimental inference. J. Amer. 
Stat. Assoc., 50, 946-967. 

Kempthorne, O. (1976). The analysis of variance and factorial design. In: On the 
History of Statistics and Probability, pp. 29-54 (Ed. D. B. Owen). Marcel Dekker, 
New York. 

Kempthorne, O. (1977). Why randomize? J. Stat. Plann. Inf., 1, 1-25. 

Kempthorne, O. and Folks, L. (1971). Probability, Statistics and Data Analysis. The 
Iowa State University Press, Ames, Iowa. 

Kendall, M. G. and Stuart, A. (1961). The Advanced Theory of Statistics, Vol. 2. 
Inference and Relationship, 3rd ed. Griffin, London. 

Kendall, M. G., Stuart, A., and Ord, J. K. (1983). The Advanced Theory of Statistics, 
Vol. 3. Design and Analysis, and Time-Series, 4th ed. Macmillan, New York. 

Keselman, H. J., Games, P. A., and Clinch, J. J. (1979). Tests for homogeneity of 
variance. Commun. Stat., B: Simul. & Comp., 8, 113-139. 

Keselman, H. J. and Rogan, J. C. (1978). A comparison of modified-Tukey and Scheffé 
methods of multiple comparisons for pairwise contrasts. J. Amer. Stat. Assoc., 73, 
47-51. 

Keselman, H. J., Toothaker, L. E., and Shooter, M. (1975). An evaluation of two 
unequal n; forms of the Tukey multiple comparison statistic. J. Amer. Stat. Assoc., 
70, 584-587. 

Keuls, M. (1952). The use of the “Studentized range” in connection with an analysis of 
variance. Euphytica, 1, 112-122. 


Khargonkar, S. A. (1948). The estimation of missing plot values in split-plot and strip 
trials. J. Ind. Soc. Agri. Stat., 1, 147-161. 

Khuri, A. I. (1987). An exact test for the nesting effect’s variance component in an 
unbalanced two-fold nested model. Stat. Prob. Lett., 5, 305-311. 

Khuri, A. I. (1990). Exact tests for random models with unequal cell frequencies in the 
last stage. J. Stat. Plann. Inf., 24, 177-193. 

Khuri, A. I. (1995). A test to detect inadequacy of Satterthwaite’s approximation in 
balanced mixed models. Statistics, 27, 45-54. 

Khuri, A. I. and Cornell, J. (1996). Response Surfaces: Designs and Analyses, 2nd 
ed. Marcel Dekker, New York. 

Khuri, A. I. and Littell, R. C. (1987). Exact tests for the main effects variance compo- 
nents in an unbalanced random two-way model. Biometrics, 43, 545-560. 

Khuri, A. I., Mathew, T., and Sinha, B. K. (1998). Statistical Tests for Mixed Linear 
Models. John Wiley, New York. 

Khuri, A. I. and Sahai, H. (1985). Variance components analysis: A selective literature 
survey. Inter. Stat. Rev., 53, 279-300. 

Kihlberg, J. K., Herson, J. H., and Schutz, W. E. (1972). Square root transformation 
revisited. Appl. Stat., 21, 76-81. 

Kimball, A. W. (1951). On dependent tests of significance in the analysis of variance. 
Ann. Math. Stat., 22, 600-602. 

Kirk, R. E. (1995). Experimental Design: Procedures for the Behavioral Sciences, 3rd 
ed. Brooks/Cole, Belmont, California. (1st ed., 1968; 2nd ed., 1982.) 

Klotz, J. H. (1969). A simple proof of Scheffé’s multiple comparison theorem for 
contrasts in the one-way layout. Amer. Stat., 23, 44-45. 

Koch, G. G., Elashoff, J. D., and Amara, I. A. (1988). Repeated measurements — 
Design and Analysis. In: Encyclopedia of Statistical Sciences, Vol. 8, pp. 46-73 
(Eds. S. Kotz and N. L. Johnson). John Wiley, New York. 

Koehler, K. J. (1983). A simple approximation for the percentiles of the t distribution. 
Technom., 25, 103-105. 

Koehler, K. J. and Larntz, K. (1980). An empirical investigation of goodness-of-fit 
statistics for sparse multinomials. J. Amer. Stat. Assoc., 75, 336-344. 

Kohr, R. L. and Games, P. A. (1974). Robustness of the analysis of variance, the Welch 
procedure, and a Box procedure to heterogeneous variances. J. Exper. Educ., 43, 
61-69. 

Kramer, C. Y. (1956). Extension of multiple range test to group means with unequal 
numbers of replications. Biometrics, 12, 307-310. 

Kramer, C. Y. (1957). Extension of multiple range tests to group correlated adjusted 
means. Biometrics, 13, 13-18. 

Kramer, C. Y. and Glass, S. (1960). Analysis of variance of a Latin square design with 
missing observations. Appl. Stat., 9, 43-50. 

Krishnaiah, P. R. (1979). Some developments on simultaneous test procedures: A 
review. In: Developments in Statistics, Vol. 2, pp. 157-201 (Ed. P. R. Krishnaiah). 
Academic Press, New York. 

Krishnaiah, P. R. (Ed.) (1980). Handbook of Statistics, Vol. 1: Analysis of Variance. 
North-Holland, Amsterdam. 

Krishnaiah, P. R. and Yochmowitz, M. G. (1980). Inference on the structure of inter- 
action in two-way classification model. In: Handbook of Statistics, Vol. 1: Analysis 
of Variance, pp. 973-994 (Ed. P. R. Krishnaiah). North-Holland, Amsterdam. 


Krutchkoff, R. G. (1988). One-way fixed effects analysis of variance when the error 
variances may be unequal. J. Stat. Comp. Simul., 30, 259-271. 

Krutchkoff, R. G. (1989). Two-way fixed effects analysis of variance when the error 
variances may be unequal. J. Stat. Comp. Simul., 32, 177-183. 

Kurtz, T. E., Link, R. F., Tukey, J. W., and Wallace, D. L. (1965). Short-cut mul- 
tiple comparisons for balanced single and double classifications, Part 1. Results. 
Technom., 7, 95-165. (Authors’ reply to Anscombe’s comment, Technom., 7, 169.) 

Kussmaul, K. and Anderson, R. L. (1967). Estimation of variance components in 
two-stage nested designs with composite samples. Technom., 9, 373-389. 

LaMotte, L. R. (1973). Quadratic estimation of variance components. Biometrics, 29, 
311-330. 

Larntz, K. (1978). Small sample comparison of exact levels of chi-squared 
goodness-of-fit statistics. J. Amer. Stat. Assoc., 73, 253-263. 

Laubscher, N. F. (1965). Interpolation in F tables. Amer. Stat., 19, 28 and 40. 

Layard, M. W. J. (1973). Robust large sample tests for homogeneity of variance. J. 
Amer. Stat. Assoc., 68, 195-198. 

Lee, P. M. (1997). Bayesian Statistics: An Introduction, 2nd ed. Arnold, London. 

Lehmann, E. L. (1975). Nonparametric Statistical Methods Based on Ranks. Holden- 
Day, San Francisco. 

Lehmer, E. (1944). Inverse tables of probabilities of errors of the second kind. Ann. 
Math. Stat., 15, 338-398. 

Lentner, M., Arnold, J., and Hinkelmann, K. (1989). The efficiency of blocking: 
How to use MS(block)/MS(error) correctly. Amer. Stat., 43, 106-111. 

Levene, H. (1960). Robust tests for equality of variances. In: Contributions to Proba- 
bility and Statistics, pp. 278-292 (Eds. I. Olkin, S. G. Ghurye, W. Hoeffding, 
W. G. Madow, and H. B. Mann). Stanford University Press, Stanford, California. 

Levine, G. (1991). A Guide to SPSS for Analysis of Variance. Lawrence Erlbaum, 
Hillsdale, New Jersey. 

Levy, K. J. (1978a). An empirical comparison of the ANOVA F-test with alternatives 
which are more robust against heterogeneity of variance. J. Stat. Comp. Simul., 8, 
49-57. 

Levy, K. J. (1978b). An empirical study of the cube root test for homogeneity of 
variances with respect to the effects of a non-normality and power. J. Stat. Comp. 
Simul., 8, 71-78. 

Lindman, H. R. (1992). Analysis of Variance in Experimental Design. Springer-Verlag, 
New York. 

Lindsey, J. K. (1993). Models for Repeated Measurements. Clarendon Press, Oxford, 
U.K. 

Littell, R. C., Freund, R. J., and Spector, P. C. (1991). SAS System for Linear Models. 
SAS Institute, Cary, North Carolina. 

Littell, R. C., Milliken, G. A., Stroup, W. W., and Wolfinger, R. D. (1996). SAS 
System for Mixed Models. SAS Institute, Cary, North Carolina. 

Lix, L. M., Keselman, J. C., and Keselman, H. J. (1996). Consequences of assumption 
violations revisited: A quantitative review of alternatives to the one-way analysis 
of variance F test. Rev. Educ. Res., 66, 579-619. 

Lorenzen, T. J. (1977). Derivation of Expected Mean Squares and F-tests in Statistical 
Experimental Design. Res. Publ., GMR-2442, Mathematics Department, General 
Motors Research Laboratories, Warren, Michigan. 


Lorenzen, T. J. (1984). Randomization and blocking in the design of experiments. 
Commun. Stat., A: Theo. & Meth., 13, 2601-2623. 

Lorenzen, T. J. (1987). A Comparison of Approximate F" Tests Under Pooling Rules. 
Res. Publ., GMR-5928, Mathematics Department, General Motors Research Lab- 
oratories, Warren, Michigan. 

Lorenzen, T. J. and Anderson, V. L. (1993). Design of Experiments: A No-Name 
Approach. Marcel Dekker, New York. 

Lurigio, A., Seng, M., Sinecore, J., and Dantzker, M. (1995). Computer Applications 
Using SPSS for Windows. Butterworth/Heinemann, Stoneham, Massachusetts. 

Mahalanobis, P. C. (1964). Professor Ronald Aylmer Fisher. Biometrics, 20, 238-251. 
(Reprinted from Sankhya, Vol. 4, 1938, pp. 265-272.) 

Mahamunulu, D. M. (1963). Sampling variances of the estimates of variance com- 
ponents in the unbalanced 3-way nested classification. Ann. Math. Stat., 34, 
521-527. 

Mandel, J. (1971). A new analysis of variance model for non-additive data. Technom., 
13, 1-18. 

Marcuse, S. (1949). Optimum allocation and variance components in nested sampling 
with an application to chemical analysis. Biometrics, 5, 189-206. 

Mardia, K. V. and Zemroch, P. J. (1978). Tables of the F and Related Distributions 
with Algorithms. Academic Press, New York. 

Maurais, J. and Ouimet, R. (1986). Exact critical values of Bartlett’s test of 
homogeneity of variances for unequal sample sizes for two populations and power of the 
test. Metrika, 33, 275-289. 

Maxwell, S. E. and Delaney, H. D. (1990). Designing Experiments and Analyzing 
Data: A Model Comparison Perspective. Wadsworth, Belmont, California. 

McCullagh, P. and Nelder, J. A. (1983, 1989). Generalized Linear Models, 1st & 2nd 
eds. Chapman & Hall, London. 

McHugh, R. B. and Mielke, P. W., Jr. (1968). Negative variance estimates and statistical 
dependence in nested sampling. J. Amer. Stat. Assoc., 63, 1000-1003. 

McLean, R. A., Sanders, W. L., and Stroup, W. W. (1991). A unified approach to 
mixed linear models. Amer. Stat., 45, 54-63. 

Mead, R., Bancroft, T. A., and Han, C. P. (1975). Power of analysis of variance test 
procedures for incompletely specified mixed models. Ann. Stat., 3, 797-808. 

Micceri, T. (1989). The unicorn, the normal curve, and other improbable creatures. 
Psychol. Bull., 105, 156-166. 

Miller, J. J. (1977). Asymptotic properties of maximum likelihood estimates in the 
mixed model of the analysis of variance. Ann. Stat., 5, 746-762. 

Miller, R. G., Jr. (1966). Simultaneous Statistical Inference. McGraw-Hill, New York. 

Miller, R. G., Jr. (1968). Jackknifing variances. Ann. Math. Stat., 39, 567-582. 

Miller, R. G., Jr. (1977). Developments in multiple comparisons, 1966-1976. J. Amer. 
Stat. Assoc., 72, 779-788. 

Miller, R. G., Jr. (1981). Simultaneous Statistical Inference, 2nd ed. Springer-Verlag, 
New York. 

Miller, R. G., Jr. (1985). Multiple comparisons. In: Encyclopedia of Statistical Sciences, 
Vol. 5, pp. 679-689 (Eds. S. Kotz and N. L. Johnson). John Wiley, New York. 

Miller, R. G., Jr. (1986). Beyond ANOVA: Basics of Applied Statistics. John Wiley, 
New York. (Reprinted 1996, Chapman & Hall, New York.) 

Milliken, G. A. and Graybill, F. A. (1970). Extensions of the general linear hypothesis 
model. J. Amer. Stat. Assoc., 65, 797-807. 


Milliken, G. A. and Graybill, F. A. (1971). Tests for interaction in the two-way model 
with missing data. Biometrics, 27, 1079-1083. 

Milliken, G. A. and Graybill, F. A. (1972). Interaction models for Latin square. Aust. 
J. Stat., 14, 129-138. 

Milliken, G. A. and Johnson, D. E. (1992). Analysis of Messy Data, Vol. 1. Chapman 
& Hall, London. 

Millman, J. and Glass, G. V. (1967). Rules of thumb for writing the ANOVA table. 
J. Educ. Meas., 4, 41-51. 

Miron, T. (1993). SAS Software Solutions: Basic Data Processing. SAS Institute, Cary, 
North Carolina. 

Miyakawa, M. (1993). An interpretation of the interaction terms in Mandel’s ANOVA 
Model from Hirotsu’s interaction elements. Rep. Stat. Appl. Res. (JUSE), 20, 1-10. 

Montgomery, D. C. (1991). Design and Analysis of Experiments, 3rd ed. John Wiley, 
New York. (1st ed., 1976; 2nd ed., 1984.) 

Morrison, D. F. (1990). Multivariate Statistical Methods, 3rd ed. McGraw-Hill, 
New York. (1st ed., 1967; 2nd ed., 1976.) 

Moses, L. E. (1978). Charts for finding upper percentage points of Student’s t in the 
range .01 to .00001. Commun. Stat., B: Simul. & Comp., 7, 479-490. 

Mosteller, F. and Tukey, J. W. (1977). Data Analysis and Regression. Addison-Wesley, 
Reading, Massachusetts. 

Murdock, G. R. and Williford, W. O. (1977). Tables for obtaining optimal confidence 
intervals involving the chi-square distribution. In: Selected Tables in Mathematical 
Statistics, Vol. 5, pp. 205-230 (Eds. D. B. Owen and R. E. Odeh). American 
Mathematical Society, Providence, Rhode Island. 

Myers, R. H. (1976). Response Surface Methodology. Allyn and Bacon, Boston. 

Myers, R. H. and Howe, R. B. (1971). On alternative approximate F tests for hypotheses 
involving variance components. Biometrika, 58, 393-396. 

Myers, R. H. and Montgomery, D. C. (1995). Response Surface Methodology. John 
Wiley, New York. 

Nagarsenker, P. B. (1984). On Bartlett’s test for homogeneity of variances. Biometrika, 
71, 405-407. 

Naik, U. D. (1974). On tests of main effects and interactions in higher-way layouts in 
the analysis of variance random effects model. Technom., 16, 17-25. 

Nair, K. R. (1940). The application of the technique of analysis of covariance to field 
experiments with several missing or mixed-up plots. Sankhya, 4, 581-588. 

Nair, K. R. (1948). Distribution of the extreme deviate from the sample mean. 
Biometrika, 35, 118-144. 

Natrella, M. G. (1963). Experimental Statistics. John Wiley, New York. (Reprint of the 
original edition published by the National Bureau of Standards, Washington, D.C. 
as Handbook No. 91.) 

Nelson, L. S. (1983). A comparison of sample sizes for the analysis of means and the 
analysis of variance. J. Qual. Tech., 15, 33-39. 

Nelson, L. S. (1985). Sample size tables for analysis of variance. J. Qual. Tech., 17, 
167-169. 

Neter, J., Kutner, M. H., Nachtsheim, C. J., and Wasserman, W. (1996). Applied 
Linear Statistical Models, 4th ed. Irwin, Burr Ridge, Illinois. 

Neter, J., Wasserman, W., and Kutner, M. H. (1990). Applied Linear Statistical 
Models: Regression, Analysis of Variance, and Experimental Designs, 3rd ed. 
Irwin, Burr Ridge, Illinois. (1st ed., 1974; 2nd ed., 1985.) 


Newman, D. (1939). The distribution of the range in samples from a normal population, 
expressed in terms of an independent estimate of standard deviation. Biometrika, 
31, 20-30. 

Norton, H. W. (1939). The 7 x 7 squares. Ann. Eug., 9, 269-307. 

O’Brien, R. G. (1979). An improved ANOVA method for robust tests of additive models 
for variances. J. Amer. Stat. Assoc., 74, 877-880. 

O’Brien, R. G. (1981). A simple test for variance effects in experimental designs. 
Psychol. Bull., 89, 570-574. 

O’Neill, R. and Wetherill, G. B. (1971). The present state of multiple comparison 
methods. J. R. Stat. Soc., Ser. B, 33, 218-250. 

Olejnik, S. F. and Algina, J. (1987). Type I error rates and power estimates of selected 
parametric and nonparametric tests of scales. J. Educ. Stat., 12, 45-61. 

Olkin, I. and Pratt, J. W. (1958). Unbiased estimation of certain correlation 
coefficients. Ann. Math. Stat., 29, 201-211. 

Ostle, B. (1952). Answer to query no. 95. Biometrics, 8, 264-266. 

Ostle, B. and Malone, L. C. (1988). Statistics in Research: Basic Concepts and 
Techniques for Research Workers, 4th ed. Iowa State University Press, Ames, Iowa. 

Ostle, B. and Mensing, R. W. (1975). Statistics in Research: Basic Concepts and 
Techniques for Research Workers, 3rd ed. Iowa State University Press, Ames, 
Iowa. 

Owen, D. B. (1962). Handbook of Statistical Tables. Addison-Wesley, Reading, Mas- 
sachusetts. 

Owen, D. B. (1968). A survey of properties and applications of the noncentral t 
distribution. Technom., 10, 445-478. 

Owen, D. B. (1985). Noncentral t distribution. In: Encyclopedia of Statistical Sciences, 
Vol. 6, pp. 286-290 (Eds. S. Kotz and N. L. Johnson). John Wiley, New York. 

Parker, E. T. (1959). Orthogonal Latin squares. Proc. Nat. Acad. Sci., 45, 459-462. 

Paull, A. E. (1950). On a preliminary test for pooling mean squares in the analysis of 
variance. Ann. Math. Stat., 21, 539-556. 

Pearson, E. S. (1931). The analysis of variance in cases of non-normal variation. 
Biometrika, 23, 114-133. 

Pearson, E. S. and Hartley, H. O. (1951). Charts of the power function for analysis of 
variance tests, derived from the non-central F-distribution. Biometrika, 38, 112- 
130. 

Pearson, E. S. and Hartley, H. O. (1970). Biometrika Tables for Statisticians, Vol. 1, 
3rd ed. Cambridge University Press, Cambridge, U.K. 

Pearson, E. S. and Hartley, H. O. (1973). Biometrika Tables for Statisticians, Vol. II, 
3rd ed. Cambridge University Press, Cambridge, U.K. 

Pearson, E. S., D’Agostino, R. B., and Bowman, K. O. (1977). Tests for departure 
from normality: Comparison of powers. Biometrika, 64, 231-246. 

Peng, K. C. (1967). The Design and Analysis of Scientific Experiments. Addison-Wesley, 
Reading, Massachusetts. 

Perry, J. N., Wall, C., and Greenway, A. R. (1980). Latin square designs in field 
experiments involving sex attractants. Ecol. Entomol., 5, 385-396. 

Petrinovich, L. F. and Hardyck, C. D. (1969). Error rates for multiple comparison 
methods. Psychol. Bull., 71, 43-54. 

Pillai, K. C. S. and Ramachandran, K. V. (1954). On the distribution of the ratio of the 
i-th observation in an ordered sample from a normal population to an independent 
estimate of the standard deviation. Ann. Math. Stat., 25, 565-572. 


Pitman, E. J. G. (1938). Significance tests which may be applied to samples from any 
population, III. The analysis of variance test. Biometrika, 29, 322-335. 

Plackett, R. L. (1960). Models in analysis of variance (with discussion). J. R. Stat. Soc., 
Ser. B, 22, 195-217. 

Pratt, J. W. and Gibbons, J. D. (1981). Concepts of Nonparametric Theory. Springer- 
Verlag, New York. 

Pukelsheim, F. (1981). On the existence of unbiased nonnegative estimates of variance 
and covariance components. Ann. Stat., 9, 293-299. 

Ramsey, P. H. (1994). Testing variances in psychological and educational research. 
J. Educ. Stat., 19, 23-42. 

Ramsey, P. H. and Brailsford, E. A. (1989). Robustness and power of tests of variability 
on two independent groups. J. Math. Stat. Psychol., 43, 113-130. 

Rankin, N. O. (1974). The harmonic mean method for one-way and two-way analysis 
of variance. Biometrika, 61, 117-122. 

Rao, C. R. (1971). Estimation of variance and covariance components — MINQUE 
theory. J. Multivar. Anal., 1, 257-275. 

Rao, C. R. (1972). Estimation of variance and covariance components in linear models. 
J. Amer. Stat. Assoc., 67, 112-115. 

Rao, C. R. (1973). Linear Statistical Inference and Its Applications, 2nd ed. John Wiley, 
New York. (1st ed., 1965.) 

Rao, C. R. and Kleffe, J. (1988). Estimation of Variance Components and Applications. 
North-Holland, Amsterdam. 

Rao, C. R. and Toutenburg, H. (1995). Linear Models: Least Squares and Alternatives. 
Springer-Verlag, New York. 

Rao, P. S. R. S. (1997). Variance Components Estimation: Mixed Models, Methodolo- 
gies and Applications. Chapman & Hall, London. 

Ratkowsky, D. A., Evans, M. A., and Alldredge, J. R. (1993). Cross-Over Experiments: 
Design, Analysis and Application. Marcel Dekker, New York. 

Resnikoff, G. J. and Lieberman, G. J. (1957). Tables of the Noncentral t Distribution. 
Stanford University Press, Stanford, California. 

Ringland, J. T. (1983). Robust multiple comparisons. J. Amer. Stat. Assoc., 78, 145-151. 

Robertson, A. (1962). Weighting in the estimation of variance components in the un- 
balanced single classification. Biometrics, 18, 413-417. 

Rogan, J. C. and Keselman, H. J. (1977). Is the ANOVA F-test robust to variance 
heterogeneity when sample sizes are equal?: An investigation via a coefficient of 
variation. Amer. Educ. Res. J., 14, 493-498. 

Rosenthal, R. and Rosnow, R. L. (1985). Contrast Analysis: Focused Comparisons in 
the Analysis of Variance. Cambridge University Press, Cambridge, U.K. 

Royston, J. P. (1982a). An extension of Shapiro and Wilk’s W test for normality to large 
samples. Appl. Stat., 31, 115-124. 

Royston, J. P. (1982b). The W test for normality. Appl. Stat., 31, 176-180. 

Royston, J. P. (1983). A simple method for evaluating the Shapiro-Francia W’ test for 
non-normality. Statistician, 32, 297-300. 

Royston, J. P. (1991). Estimating departures from normality. Stat. Med., 10, 1283- 
1291. 

Royston, J. P. (1993a). Graphical detection of non-normality by using Michael’s 
statistic. Appl. Stat., 42, 153-158. 

Royston, J. P. (1993b). A toolkit for non-normality in complete and censored samples. 
Statistician, 42, 37-43. 


Royston, J. P. (1993c). A pocket calculator algorithm for the Shapiro-Francia test for 
non-normality: An application to medicine. Stat. Med., 12, 181-184. 

Royston, J. P., Flecknell, P. A., and Wootton, R. (1982). New evidence that the intra- 
uterine growth retarded piglet is a member of a discrete subpopulation. Biol. Neon., 
42, 100-104. 

Rubin, D. B. (1972). A non-iterative algorithm for least squares estimation of missing 
values in any analysis of variance design. Appl. Stat., 21, 136-141. 

Sahai, H. (1974a). On negative estimates of variance components under finite population 
models. S. Afric. Stat. J., 8, 157-166. 

Sahai, H. (1974b). Non-negative maximum likelihood and restricted maximum 
likelihood estimators of variance components in two simple linear models. Util. Math., 
5, 151-160. 

Sahai, H. (1976). A comparison of estimators of variance components in the balanced 
three-stage nested random effects model using mean squared error criterion. J. Amer. 
Stat. Assoc., 71, 435-444. 

Sahai, H. (1979). A bibliography on variance components. Inter. Stat. Rev., 47, 177-222. 

Sahai, H. (1988). Two-way mixed model: A brief review. N. Zealand Statist., 23, 58-65. 

Sahai, H., Khuri, A. I., and Kapadia, C. H. (1985). A second bibliography on variance 
components. Commun. Stat., A: Theo. & Meth., 14, 63-115. 

Sahai, H. and Khurshid, A. (1992). A comparison of estimators of variance components 
in a two-way balanced crossed classification random effects model. Statistics, 23, 
128-143. 

Sahai, H. and Thompson, W. O. (1973). Non-negative maximum likelihood estimators 
of variance components in a simple linear model. Amer. Stat., 27, 112-113. 

Samuels, M. L., Casella, G., and McCabe, G. P. (1991). Interpreting blocks and 
random factors. J. Amer. Stat. Assoc., 86, 798-808. 

SAS Institute (1989). SAS Language and Procedures: Usage. Version 6.0. SAS Insti- 
tute, Cary, North Carolina. 

SAS Institute (1990a). SAS Language: Reference. Version 6.0. SAS Institute, Cary, 
North Carolina. 

SAS Institute (1990b). SAS Procedures Guide. Version 6.0, 3rd ed. SAS Institute, Cary, 
North Carolina. 

SAS Institute (1990c). SAS/STAT User’s Guide. Version 6.0, Vols. I & II, 4th ed. SAS 
Institute, Cary, North Carolina. 

SAS Institute (1991). SAS Language and Procedures: Usage 2. Version 6.0. SAS 
Institute, Cary, North Carolina. 

SAS Institute (1992). SAS Introductory Guide for PC’s. Version 6.03. SAS Institute, 
Cary, North Carolina. 

SAS Institute (1997). SAS/STAT Software: Changes and Enhancements through Re- 
lease 6.12. SAS Institute, Cary, North Carolina. 

Satterthwaite, F. E. (1946). An approximate distribution of estimates of variance com- 
ponents. Biom. Bull., 2, 110-114. 

Scheffé, H. (1953). A method for judging all contrasts in the analysis of variance. 
Biometrika, 40, 87-104. 

Scheffé, H. (1956a). A “mixed model” for the analysis of variance. Ann. Math. Stat., 
27, 23-36. 

Scheffé, H. (1956b). Alternative models for the analysis of variance. Ann. Math. Stat., 
27, 251-271. 


Scheffé, H. (1959). The Analysis of Variance. John Wiley, New York. 

Schervish, M. L. (1992). Bayesian analysis of linear models. In: Bayesian Statistics, IV,
pp. 419-434 (Eds. J. M. Bernardo, J. O. Berger, A. P. Dawid, and A. F. M. Smith).
Oxford University Press, New York. 

Schlotzhauer, S. D. and Littell, R. C. (1997). SAS System for Elementary Statistical 
Analysis, 2nd ed. SAS Institute, Cary, North Carolina. (1st ed., 1987.) 

Schultz, E. F., Jr. (1954). Answer to query no. 110. Biometrics, 10, 407-411. 

Schultz, E. F., Jr. (1955). Rules of thumb for determining expectations of mean squares
in analysis of variance. Biometrics, 11, 123-148. 

Searle, S. R. (1961). Variance components in the unbalanced two-way nested classifi- 
cation. Ann. Math. Stat., 32, 1161-1166. 

Searle, S. R. (1968). Another look at Henderson’s methods of estimating variance 
components. Biometrics, 24, 749-778. 

Searle, S. R. (1971a). Topics in variance components estimation. Biometrics, 27, 1-76. 

Searle, S. R. (1971b). Linear Models. John Wiley, New York. (Wiley Classic Edition, 
1997.) 

Searle, S. R. (1987). Linear Models for Unbalanced Data. John Wiley, New York. 

Searle, S. R. (1988). Mixed models and unbalanced data: Wherefrom, whereat, and 
whereto? Commun. Stat., A: Theo. & Meth., 17, 935-968. 

Searle, S. R. (1995). An overview of variance component estimation. Metrika, 42, 
215-230. 

Searle, S. R., Casella, G., and McCulloch, C. E. (1992). Variance Components. John 
Wiley, New York. 

Searle, S. R. and Fawcett, R. F. (1970). Expected mean squares in variance components 
models having finite populations. Biometrics, 26, 243-254. 

Seely, J. F. and El-Bassiouni, Y. (1983). Applying Wald’s variance component test. 
Ann. Stat., 11, 197-201. 

Seifert, B. (1981). Explicit formulae of exact tests in mixed balanced ANOVA models. 
Biomet. J., 23, 535-550. 

Sen, B., Graybill, F. A., and Ting, N. (1992). Confidence intervals on ratios of variance 
components for the unbalanced two-factor nested model. Biomet. J., 34, 259-274. 

Senn, S. (1993). Cross-Over Trials in Clinical Research. John Wiley, Chichester, U.K. 

Shaffer, J. P. (1977). Multiple comparison emphasizing selected contrasts: An extension 
and generalization of Dunnett’s procedure. Biometrics, 33, 293-303. 

Shaffer, J. P. (1986). Modified sequentially rejective multiple test procedures. J. Amer. 
Stat. Assoc., 81, 826-831. 

Shapiro, S. S. and Francia, R. S. (1972). An approximate analysis of variance test for 
normality. J. Amer. Stat. Assoc., 67, 215-216. 

Shapiro, S. S. and Wilk, M. B. (1965). An analysis of variance test for normality 
(complete samples). Biometrika, 52, 591-611. 

Shapiro, S. S., Wilk, M. B., and Chen, H. J. (1968). A comparative study of various 
tests for normality. J. Amer. Stat. Assoc., 63, 1343-1372. 

Sidak, A. (1967). Rectangular confidence regions for the means of multivariate normal 
distributions. J. Amer. Stat. Assoc., 62, 626-633. 

Singh, B. (1987). On the non-null distribution of ANOVA F-ratio in one-way unbalanced 
random model. Calcutta Stat. Assoc. Bull., 36, 57-62. 

Singhal, R. A. (1987). Confidence limits on heritability under nonnormal variations. 
Biomet. J., 29, 571-578. 


Singhal, R. A. and Sahai, H. (1992). Sampling distribution of the ANOVA estimator 
of between variance component in samples from a non-normal universe. J. Stat. 
Comp. Simul., 43, 19-30. 

Singhal, R. A. and Sahai, H. (1994). Effects of non-normality on the power function 
in a one-way random model. Stat. Papers, 35, 113-125. 

Singhal, R. A. and Singh, C. (1984). Distribution of the variance ratio test in a non- 
normal random effects model. Sankhya, Ser. B, 46, 29-35. 

Singhal, R. A., Tiwari, C. B., and Sahai, H. (1988). A selected and annotated biblio- 
graphy on the robustness studies to non-normality in variance components models. 
J. Jap. Stat. Soc., 18, 195-206. 

Smith, D. W. and Murray, L. W. (1984). An alternative to Eisenhart’s Model II and 
mixed model in the case of negative variance estimates. J. Amer. Stat. Assoc., 79, 
145-151. 

Smith, H. F. (1951). Analysis of variance with unequal but proportional numbers of 
observations in the sub-classes of a two-way classification. Biometrics, 7, 70-74.

Snedecor, G. W. (1934). Analysis of Variance and Covariance. Iowa State University 
Press, Ames, Iowa. 

Snedecor, G. W. (1955). Query no. 113. Biometrics, 11, 111-113. 

Snedecor, G. W. and Cochran, W. G. (1967, 1989). Statistical Methods, 6th & 8th eds. 
Iowa State University Press, Ames, Iowa. 

Snee, R. D. (1985). Graphical display of results of three-treatment randomized block 
experiments. Appl. Stat., 34, 71-77. 

Snell, E. J. (1987). Applied Statistics: A Handbook of BMDP Analyses. Chapman & 
Hall, London. 

Sokal, R. R. and Rohlf, F. J. (1995). Biometry, 3rd ed. W. H. Freeman, New York. (1st
ed. 1969; 2nd ed. 1981.) 

Spector, P. E. (1993). SAS Programming for Researchers and Social Scientists. Sage, 
Thousand Oaks, California. 

Spjotvoll, E. (1967). Optimum invariant tests in unbalanced variance components mod- 
els. Ann. Math. Stat., 38, 422-428. 

Spjotvoll, E. (1968). Confidence intervals and tests for variance ratios in unbalanced 
variance components models. Inter. Stat. Rev., 36, 37-42. 

Spjotvoll, E. and Stoline, M. R. (1973). An extension of the T-method of multiple 
comparisons to include the cases with unequal sample sizes. J. Amer. Stat. Assoc., 
68, 975-978. 

Sprent, P. (1997). Applied Nonparametric Statistical Methods, 2nd ed. Chapman & 
Hall, London. (1st ed., 1989.) 

SPSS, Inc. (1997a). SPSS Base 7.5 for Windows User’s Guide. SPSS, Inc., Chicago, 
Illinois. 

SPSS, Inc. (1997b). SPSS Professional Statistics 7.5. SPSS, Inc., Chicago, Illinois.

SPSS, Inc. (1997c). SPSS Advanced Statistics 7.5. SPSS, Inc., Chicago, Illinois.

Srivastava, A. B. L. (1959). Effects of non-normality on the power of the analysis of 
variance test. Biometrika, 46, 114-122. 

Srivastava, S. R. and Bozivich, H. (1962). Power of certain analysis of variance test 
procedures involving preliminary test. Bull. Inter. Stat. Inst., 39, 133. 

Steel, G. D. and Torrie, J. H. (1980). Principles and Procedures of Statistics: A
Biometrical Approach, 2nd ed. McGraw-Hill, New York. (1st ed., 1960.)

Steel, G. D., Torrie, J. H., and Dickey, D. A. (1997). Principles and Procedures of
Statistics: A Biometrical Approach, 3rd ed. McGraw-Hill, New York. 


Stoline, M. R. (1978). Tables of the Studentized augmented range and applications to 
problems of multiple comparison. J. Amer. Stat. Assoc., 73, 656-660. 

Stoline, M. R. (1981). The status of multiple comparisons: Simultaneous estimation 
of all pairwise comparisons in one-way ANOVA designs. Amer. Stat., 35, 134- 
141. 

Stoline, M. R. and Ury, H. K. (1979). Tables of the Studentized distribution and an 
application to multiple comparisons among means. Technom., 21, 87-93. 

Street, A. P. and Street, D. J. (1987). Combinatorics of Experimental Design. Clarendon 
Press, Oxford. 

Street, A. P. and Street, D. J. (1988). Latin squares and agriculture: The other bicen- 
tennial. Math. Scient., 13, 48-55.

Stroup, W. W. (1989). Why mixed models? Applications of mixed models in agriculture 
and related disciplines. South. Coop. Ser. Bull. No. 343, 183-201. 

Szatrowski, T. H. and Miller, J. J. (1980). Explicit maximum likelihood estimates 
for balanced data in the mixed model of the analysis of variance. Ann. Stat., 8, 
811-819. 

Tabachnick, B. G. and Fidell, L. S. (1991). Software for advanced ANOVA courses: 
A survey. Beh. Res. Meth., Instrum. & Comp., 23, 208-211. 

Tamhane, A. C. (1979). A comparison of procedures for multiple comparisons of means 
with unequal variances. J. Amer. Stat. Assoc., 74, 471-480.

Tan, W. Y. (1981). The power function and an approximation for testing variance com- 
ponents in the presence of interaction in two-way random effects models. Canad. 
J. Stat., 9, 91-99. 

Tan, W. Y. and Cheng, S. S. (1984). On testing variance components in three-stages 
unbalanced nested random effects models. Sankhya, Ser. B, 46, 188-200. 

Tan, W. Y. and Tabatabai, M. A. (1986). Some Monte Carlo studies on the comparison 
of several means under heteroscedasticity and robustness with respect to departure 
from normality. Biomet. J., 28, 801-814. 

Tan, W. Y., Tabatabai, M. A., and Balakrishnan, N. (1988). Harmonic mean approach 
to unbalanced random effects model under heteroscedasticity. Commun. Stat., A: 
Theo. & Meth., 17, 1261-1286. 

Tan, W. Y. and Wong, S. P. (1980). On approximating the null and non-null distributions 
of the F ratio in unbalanced random effects models from non-normal universes. J. 
Amer. Stat. Assoc., 75, 655-662. 

Tang, P. C. (1938). The power function of the analysis of variance tests with tables and 
illustrations for their use. Stat. Res. Mem., 2, 126-149. 

Tate, R. F. and Klett, G. W. (1959). Optimal confidence intervals for the variance of a 
normal distribution. J. Amer. Stat. Assoc., 54, 674-682.

Theune, J. A. (1973). Comparison of power for the D’Agostino and the Wilk-Shapiro
tests of normality for small and moderate samples. Stat. Neerland., 27, 163-169.

Thomas, J. D. and Hultquist, R. A. (1978). Interval estimation for the unbalanced case 
of the one-way random effects model. Ann. Stat., 6, 582-587. 

Thomsen, I. B. (1975). Testing hypotheses in unbalanced variance components models 
for two-way layouts. Ann. Stat., 3, 257-265. 

Thoni, H. (1967). Transformation of Variables Used in the Analysis of Experimental 
and Observational Data: A Review. Tech. Rep. No. 7, Statistical Laboratory, Iowa
State University, Ames, Iowa. 

Tietjen, G. L. (1974). Exact and approximate tests for unbalanced random effects 
designs. Biometrics, 30, 573-581. 


Tiku, M. L. (1964). Approximating the general non-normal variance-ratio sampling 
distributions. Biometrika, 51, 83-95. 

Tiku, M. L. (1967). Tables of the power of the F-test. J. Amer. Stat. Assoc., 62, 529-539. 
(Corrigenda 63, 1551.) 

Tiku, M. L. (1971). Power function of F-test under non-normal situations. J. Amer. 
Stat. Assoc., 66, 913-916. 

Tiku, M. L. (1972). More tables of the power of the F-test. J. Amer. Stat. Assoc., 67, 
709-710. 

Tiku, M. L. (1974). Doubly noncentral F distribution — Tables and applications. In: 
Selected Tables in Mathematical Statistics, Vol. 2, pp. 139-178 (Eds. H. L. Harter 
and D. B. Owen). American Mathematical Society, Providence, Rhode Island. 

Tiku, M. L. (1985a). Noncentral chi-square distribution. In: Encyclopedia of Statistical 
Sciences, Vol. 6, pp. 276-280 (Eds. S. Kotz and N. L. Johnson). John Wiley,
New York.

Tiku, M. L. (1985b). Noncentral F-distribution. In: Encyclopedia of Statistical Sciences,
Vol. 6, pp. 280-284 (Eds. S. Kotz and N. L. Johnson). John Wiley, New York.

Ting, N., Burdick, R. K., Graybill, F. A., Jeyaratnam, S., and Lu, T. F. C. (1990). 
Confidence intervals on linear combinations of variance components that are un- 
restricted in sign. J. Stat. Comp. Simul., 35, 135-143. 

Tippett, L. H. C. (1931). The Methods of Statistics, 1st ed. Williams and Norgate,
London. (4th ed., John Wiley, New York.) 

Tippett, L. H. C. (1934). Applications of Statistical Methods to the Control of Quality 
in Industrial Production. Manchester Statistical Society, Manchester, U.K. 

Tocher, K. D. (1952). The design and analysis of block experiments. J. R. Stat. Soc., 
Ser. B, 14, 45-100. 

Toothaker, L. E. (1991). Multiple Comparison for Researchers. Sage, Thousand Oaks, 
California. 

Tukey, J. W. (1949a). Dyadic ANOVA, an analysis of variance for vectors. Hum. Biol., 
21, 65-110. 

Tukey, J. W. (1949b). One degree of freedom for nonadditivity. Biometrics, 5, 232-242. 

Tukey, J. W. (1949c). Interaction in a Row-by-Column Design. Mem. Rep. 18, Statistical
Research Group, Princeton University, Princeton, New Jersey. 

Tukey, J. W. (1950). Finite Sampling Simplified. Mem. Rep. 45, Statistical Research 
Group, Princeton University, Princeton, New Jersey. 

Tukey, J. W. (1953). The Problem of Multiple Comparisons (Mimeographed manuscript 
of 396 pages). Department of Mathematics, Princeton University, Princeton, 
New Jersey. 

Tukey, J. W. (1955). Answer to query no. 113. Biometrics, 11, 111-113. 

Tukey, J. W. (1957). On the comparative anatomy of transformations. Ann. Math. Stat., 
28, 602-632. 

Tukey, J. W. (1977). Exploratory Data Analysis. Addison-Wesley, Reading, Mas- 
sachusetts. 

Tukey, J. W. (1991). The philosophy of multiple comparisons. Stat. Sci., 6, 100-116. 

Ury, H. K. (1976). A comparison of four procedures for multiple comparisons among 
means (pairwise contrast) for arbitrary sample sizes. Technom., 18, 89-97. 

Ury, H. K., Stoline, M., and Mitchell, B. T. (1980). Further tables of the Studen- 
tized maximum modulus distribution. Commun. Stat., B: Simul. & Comp., 9, 167- 
178. 

Vanderbeck, J. P. and Cook, J. R. (1961). Extended Table of Percentage Points of the
Chi-Square Distribution. Nav. Rep. No. 7770, U.S. Naval Ordnance Test Station,
China Lake, California.

Verdooren, L. R. (1988). Exact tests and confidence intervals for ratio of variance 
components in unbalanced two- and three-stage nested designs. Commun. Stat., A: 
Theo. & Meth., 9, 1197-1230. 

Vidmar, T. J. and Brunden, M. N. (1980). Optimal allocation with fixed power in 
a completely randomized design with levels of subsampling. Commun. Stat., A: 
Theo. & Meth., 9, 757-763. 

Walker, H. M. (1940). Degrees of freedom. J. Educ. Psychol., 31, 253-269. 

Wang, S.-G. and Chow, S.-C. (1994). Advanced Linear Models: Theory and Applica- 
tion. Marcel Dekker, New York. 

Weekes, A. J. (1983). A Genstat Primer. Arnold, London. 

Weerahandi, S. (1995). ANOVA under unequal error variances. Biometrics, 51, 589- 
599. 

Welch, B. L. (1936). The specification of rules for rejecting too-variable a product, with 
particular reference to an electric lamp problem. J. R. Stat. Soc., Suppl., 3, 29-48. 

Welch, B. L. (1937). On the Z-test in randomized blocks and Latin squares. Biometrika, 
29, 21-52. 

Welch, B. L. (1956). On linear combinations of several variances. J. Amer. Stat. Assoc., 
51, 132-148. 

Wheeler, R. E. (1974). Portable power. Technom., 16, 193-201. 

Wilcox, R. R. (1988). A new alternative to the ANOVA F and new results on James’ 
second-order method. Brit. J. Math. Stat. Psychol., 41, 109-117. 

Wilcox, R. R. (1993). Robustness in ANOVA. In: Applied Analysis of Variance in 
Behavioral Science, pp. 345-374 (Ed. L. K. Edwards). Marcel Dekker, New York. 

Wilk, M. B. (1955). Linear Models and Randomized Experiments. Ph.D. Thesis, Iowa
State College, Ames, Iowa. 

Wilk, M. B. and Kempthorne, O. (1955). Fixed, mixed and random models. J. Amer. 
Stat. Assoc., 50, 1144-1167. (Corrigenda 51, 652.)

Wilk, M. B. and Kempthorne, O. (1956). Some aspects of the analysis of factorial 
experiments in a completely randomized design. Ann. Math. Stat., 27, 950-985. 

Wilk, M. B. and Kempthorne, O. (1957). Non-additivities in a Latin square design. 
J. Amer. Stat. Assoc., 52, 218-236. 

Williams, J. S. (1962). A confidence interval for variance components. Biometrika, 49, 
278-281. 

Winer, B. J. (1962, 1971). Statistical Principles in Experimental Design, 1st & 2nd eds.
McGraw-Hill, New York. 

Winer, B. J., Brown, D. R., and Michels, K. M. (1991). Statistical Principles in 
Experimental Design, 3rd ed. McGraw-Hill, New York. 

Wolach, A. H. (1983). BASIC Analysis of Variance Programs for Microcomputers. 
Brooks/Cole, Monterey, California.

Yates, F. (1933). The analysis of replicated experiments when the field results are 
incomplete. Empire J. Exper. Agri., 1, 129-142.

Yates, F. (1934). The analysis of multiple classification with unequal numbers in the 
different classes. J. Amer. Stat. Assoc., 29, 51-66. 

Yates, F. (1936a). Incomplete randomized blocks. Ann. Eug., 7, 121-140. 

Yates, F. (1936b). Incomplete Latin squares. J. Agri. Sci., 26, 301-315. 

Yates, F. (1937a). A further note on the arrangement of variety trials: Quasi-Latin 
squares. Ann. Eug., 7, 319-332. 


Yates, F. (1937b). The Design and Analysis of Factorial Experiments. Imperial Bureau 
of Soil Science, Harpenden, England. 

Yates, F. and Hale, R. W. (1939). The analysis of Latin squares when two or more 
rows, columns, or treatments are missing. J. R. Stat. Soc., Suppl., 6, 67-69. 

Youden, W. J. (1937). Use of incomplete block replications in estimating tobacco- 
mosaic virus. Contr. Boyce Thompson Inst., 9, 41-48. 

Youden, W. J. (1940). Experimental designs to increase accuracy of greenhouse studies. 
Contr. Boyce Thompson Inst., 11, 219-228. 

Youden, W. J. (1951). Statistical Methods for Chemists. John Wiley, New York. 

Zar, J. H. (1996). Biostatistical Analysis, 3rd ed. Prentice-Hall, Upper Saddle River, 
New Jersey. (1st ed., 1974; 2nd ed., 1984.) 


Author Index 


Abraham, J. K., 503, 689 

Afifi, A. A., 147, 689 

Afsarinejad, K., 522, 700 

Akutowicz, F., 444, 689 

Alexander, R. A., 86, 689 

Algina, J., 86, 108, 689, 694, 708 

Alldredge, J. R., 521, 709 

Allen, R. E., 146, 689 

Amara, I. A., 523, 704 

Anderson, D. L., 368 

Anderson, R. L., 66, 147, 314, 316, 428, 
429, 456, 457, 494, 514, 530, 
536, 579, 580, 689, 705 

Anderson, S. L., 85, 106, 691 

Anderson, T. W., 9, 689 

Anderson, V. L., 307, 523, 527, 528, 656, 
689, 706 

Andrews, D. M., xi, 689 

Anionwu, E., 121, 122, 689 

Anscombe, F. J., 110, 111, 689 

Applebaum, M. E., 111, 692 

Armitage, J. V., 577, 689 

Arnold, J., 492, 705 

Arteaga, C., 143, 690 

Arvesen, J. N., 33, 86, 690 

Aster, R., 544, 690 

Atkinson, A. C., 485, 690 


Bagui, S. C., 574, 690 

Bailey, B. J. R., 79, 647, 690 

Balaam, L. N., 485, 697 

Balakrishnan, N., 59, 60, 224, 574, 
575, 576, 703, 713 

Bancroft, T. A., 200, 222, 223, 314, 316, 
428, 429, 530, 536, 689, 690, 692, 
706 

Bankier, J. D., 307, 690 

Barcikowski, R. S., 633 

Barnett, V. D., 97, 100, 690 

Bartlett, M. S., vii, 98, 109, 110, 111, 690 

Beall, G., 110, 690 


Bechhoefer, R. E., 577, 653, 690 

Beckman, R. J., 97, 690 

Bennett, C. A., 8, 147, 307, 435, 461, 
464, 470, 474, 690 

Berry, D. A., 109, 174, 690 

Birch, N. J., 291, 365, 366, 690, 692, 701 

Bishop, D. J., 100, 691 

Bishop, T. A., 86, 87, 691 

Blackwell, T., 237, 238, 307, 691 

Blair, R. C., 108, 689 

Blischke, W. R., 314, 691 

Bliss, C., 365, 427, 428, 691 

Boardman, T. J., 32, 691 

Bock, R. D., 307, 691 

Bolk, R. J., 263, 691 

Boneau, C. A., 85, 691 

Bose, R. C., 507, 691 

Bowker, A. H., 61, 685, 691 

Bowman, K. O., 62, 64, 93, 228, 691, 
703, 708 

Box, G. E. P., 9, 85, 86, 87, 106, 109, 
112, 168, 278, 311, 371, 372, 519, 
524, 691, 692 

Bozivich, H., 200, 692, 712 

Bradley, J. V., 85, 692 

Brailsford, E. A., 107, 709 

Bratcher, T. L., 64, 635, 692 

Broemeling, L. D., 9, 692 

Brown, C., 237, 238, 307, 691 

Brown, D. R., 29, 30, 485, 521, 582, 
715 

Brown, M. B., 86, 87, 107, 692 

Brown, R. A., 86, 692 

Brownlee, K. A., 407, 408, 453, 454, 
464, 692 

Brozovic, M., 121, 122, 689 

Brunden, M. N., 400, 715 

Budescu, D. V., 111, 692 

Bulgren, W. G., 575, 692 

Bulmer, M. G., 32, 692 

Burch, L., 544, 692 


Burdick, R. K., 32, 33, 38, 143, 144, 208, 
209, 226, 244, 245, 291, 295, 297, 
353, 358, 365, 366, 367, 400, 403, 
406, 407, 580, 690, 692, 693, 701, 
714 


Cameron, J. M., 580, 693 

Casella, G., 9, 29, 193, 225, 226, 403, 
492, 580, 710, 711 

Chambers, J. M., 89, 693 

Chao, M.-T., 100, 693 

Charles, J. A., 121, 701 

Chen, H. J., 93, 711 

Cheng, S. S., 365, 713 

Chew, V., 64, 693 

Chow, S.-C., 8, 715 

Christensen, R., 8, 693 

Clark, V. A., ix, 4, 696 

Cleveland, W. S., 89, 693 

Clinch, J. J., 86, 108, 693, 703 

Coakes, S. J., 550, 693 

Cochran, W. G., 84, 85, 90, 105, 106, 
109, 147, 148, 223, 238, 290, 378, 
379, 484, 485, 493, 497, 503, 507, 
509, 516, 519, 520, 521, 524, 582, 
596, 603, 676, 693, 712 

Cody, R. P., 544, 693 

Cohen, A., 98, 694 

Cohen, J., 62, 694 

Collins, L. M., 30, 695 

Collyer, C. E., 543, 694 

Conover, W. J., 9, 108, 694 

Cook, J. R., 571, 714 

Cook, R. D., 97, 690 

Cooley, W. W., 307, 694 

Coombs, W. T., 86, 108, 689, 694 

Cornell, J., 311, 704 

Cornfield, J., 181, 307, 461, 464, 471, 
694 

Cox, D. R., 109, 112, 259, 278, 484, 485, 
503, 519, 521, 691, 694, 701 

Cox, G. M., 147, 148, 290, 484, 485, 497, 
507, 509, 516, 519, 520, 521, 522, 
524, 603, 693 

Craig, A. T., 572, 702 

Crisler, L., 550, 694 

Crowder, M. J., 522, 694 

Crump, S. L., 4, 277, 580, 694 


Cummings, W. B., 365, 694 
Curtiss, J. H., 109, 694 


D’Agostino, R. B., 93, 96, 97, 658, 694,
708 

Daly, F., xi, 699

Damon, R. A., Jr., 343, 368, 411, 412, 
516, 537, 694 

Daniel, W. W., 9, 694 

Daniels, H. E., 4, 694 

Dantzker, M., 550, 706 

Das, M. N., 516, 694 

Davenport, J. M., 291, 694, 695 

David, H. A., 105, 663, 695 

Davies, O. L., 151, 279, 280, 345, 504, 
509, 520, 695 

Day, S. J., 60, 62, 695 

Delaney, H. D., 521, 706 

DeLury, D. B., 503, 695 

Dempster, A. P., 147, 695 

Dénes, J., 505, 695 

Derr, G., 544, 696 

Desmond, D. J., 455, 695 

Dickey, D. A., 485, 712 

DiIorio, F. C., 544, 695

Dijkstra, J. B., 86, 695 

Dixon, W. J., 307, 558, 695 

Dobson, A. J., 8, 695 

Dodge, Y., 147, 509, 695 

Donaldson, T. S., 85, 695 

Donner, A., 31, 38, 581, 695 

Donoghue, J. R., 30, 695 

Draper, N. R., vii, 109, 223, 311, 691, 
695 

Dudewicz, E. J., 86, 87, 691 

Duncan, A. J., 58, 696

Duncan, D. B., 81, 644, 696 

Dunlop, G., 509, 696 

Dunn, O. J., ix, 4, 79, 80, 577, 696 

Dunnett, C. W., 81, 83, 84, 86, 577, 
641, 690, 696 

Dyer, D. D., 100, 102, 103, 659, 662, 
696 

Dykstra, Jr., O., 344 


Efron, B., 109, 696 
Eickman, J., 38, 692 
Eisen, E. J., 579, 696 


Eisenhart, C., 4, 106, 109, 664, 696 
Elashoff, J. D., 523, 704 

Elashoff, R. M., 147, 689 
El-Bassiouni, Y., 366, 711 

Elliott, R. J., 544, 696 

Enns, J. T., 543, 694 

Euler, L., 507, 696 

Evans, M. A., 521, 709 

Everitt, B., 544, 696 


Fawcett, R. F., 8, 475, 711

Federer, A. J., 485, 697 

Federer, W. T., 222, 485, 516, 520, 522, 
697 

Feldt, L. S., 62, 686, 688, 697 

Fidell, L. S., 543, 713 

Fisher, L., ix, 527, 531, 697 

Fisher, M., 550, 700 

Fisher, R. A., vii, 1, 2, 3, 4, 66, 77, 112, 
484, 487, 488, 497, 504, 507, 
537, 538, 569, 572, 573, 580, 
581, 582, 697 

Flecknell, P. A., 95, 710 

Fleiss, J. L., 60, 79, 485, 516, 521, 
697 

Folks, L., 5, 22, 703 

Forsythe, A. B., 86, 87, 107, 692 

Fox, M., 58, 697 

Francia, R. S., 94, 95, 711 

Franklin, N. L., 8, 147, 307, 435, 461, 
464, 470, 474, 690 

Freeman, M. F., 109, 110, 111, 697 

Freund, R. J., 8, 544, 697, 705 

Friendly, M., 544, 697 

Frude, N., 550, 698 


Gabriel, K. R., 73, 698 

Gaebelin, J., 213, 701 

Gallo, J., 227, 698 

Galton, F., vii

Games, P. A., 80, 82, 84, 86, 87, 
108, 651, 698, 702, 703, 704 

Ganguli, M., 401, 698 

Gartside, P. S., 106, 698 

Gates, C. E., 407, 698 

Gayen, A. K., 85, 698 

Gaylor, D. W., 8, 290, 365, 475, 579, 
694, 698 


Geary, R. C., 85, 92, 698 

Ghosh, M. N., 263, 698 

Gibbons, J. D., 23, 698, 709 

Gill, J. L., 485, 698 

Giri, N. C., 516, 694 

Glaser, R. E., 100, 693, 698 

Glass, G. V., 85, 86, 307, 698, 707 

Glass, S., 503, 704 

Glen, W. A., 147, 698 

Goldsmith, P. L., 279, 280, 695 

Gosset, W. S., 569 

Gosslee, D. G., 219, 698 

Govern, D. M., 86, 689 

Govindarajulu, Z., 574, 700 

Gower, J. C., 407, 698 

Graham, D. F., 60, 62, 695

Graybill, F. A., 8, 22, 27, 28, 29, 32, 33, 
37, 38, 57, 135, 143, 144, 182, 
193, 208, 209, 225, 226, 227, 
244, 245, 263, 264, 291, 295, 
297, 353, 358, 363, 366, 367, 
374, 375, 400, 403, 406, 407, 
503, 580, 601, 690, 692, 693, 
698, 699, 703, 706, 707, 711, 
714 

Greenberg, B. G., 484, 699 

Greenway, A. R., 509, 708 

Groggel, D. J., 34, 699 

Guenther, W. C., ix, 699 


Hahn, G. J., 485, 577, 699 

Hald, A., 569, 571, 573, 699 

Hale, R. W., 503, 716 

Hall, I. J., 108, 699 

Halvorsen, K. T., 328, 699 

Hampel, F. R., 97, 699 

Han, C. P., 200, 706 

Hand, D. J., xi, 522, 694, 699 

Hardy, K. A., 544, 695 

Hardyck, C. D., 86, 708 

Harsaae, E., 100, 699 

Harter, H. L., 95, 571, 577, 699 

Hartley, H. O., 58, 79, 91, 92, 100, 104, 
105, 106, 200, 307, 569, 571, 
573, 577, 638, 655, 680, 692, 699, 
700, 708 

Hartung, J., 353, 700 

Hartwell, T. D., 8, 475, 698 


Harvey, W. R., 343, 368, 411, 412, 
516, 537, 694 

Harville, D. A., 29, 181, 700 

Hastay, M. W., 664, 696 

Hatcher, L., 544, 700 

Hawkins, D. M., 97, 700 

Hayman, G. E., 574, 700 

Hayter, A. J., 82, 700 

Healy, M. J. R., 147, 700 

Hedayat, A., 522, 700 

Hedderson, J., 550, 700 

Hegemann, V., 263, 700 

Heiberger, R. M., 543, 700 

Hemmerle, W. J., 307, 700 

Henderson, C. R., 224, 307, 700 

Hendrickson, R. W., 577, 699 

Hendy, M. F., 121, 701

Herbach, L. H., 206, 701 

Hernandez, R. P., 365, 366, 701 

Herr, D. G., 213, 701 

Herson, J. H., 111, 704 

Herzberg, A., xi, 689 

Herzberg, A. M., 485, 701 

Herzberg, P. A., 544, 701 

Hicks, C. R., 390, 391, 485, 701 

Hinkelmann, K., 146, 168, 485, 492, 
503, 516, 701, 705 

Hinkley, D. V., 8, 701 

Hirotsu, C., 150, 223, 224, 263, 701 

Hoaglin, D. C., 89, 109, 701 

Hochberg, Y., 64, 83, 84, 268, 577, 
701 

Hocking, R. R., 8, 181, 226, 701, 
702 

Hodges, J. L., Jr., 23, 702 

Hoel, D. G., 62, 703 

Hogg, R. V., 572, 702 

Hollander, M., 9, 702 

Holm, S., 79, 702 

Hopper, F. N., 290, 579, 698 

Horsnell, G., 87, 702 

Houseman, E. E., 66, 689 

Howe, R. B., 291, 707 

Howell, J. F., 82, 84, 698, 702

Hoyle, M. H., 109, 147, 702 

Hsu, J., 64, 702 

Huber, P. J., 97, 702 

Huck, S. W., 215, 702 

Hudson, J. D., 290, 702 

Huitema, B. E., 582, 702 


Huitson, A., ix, 224, 702 

Hultquist, R. A., 28, 38, 699, 713 

Hunter, J. S., 371, 372, 519, 524, 692 

Hunter, W. G., 109, 223, 371, 372, 
519, 524, 692, 695 

Hussein, M., 365, 702 


Imhof, J. P., 268, 285, 702 


Jackson, R. W. B., 4, 702 

Jaffe, J. A., 544, 702 

James, G. S., 86, 87, 702 

Japanese Standards Association, 106, 702 

Jeyaratnam, S., 33, 143, 291, 690, 703, 
714 

John, J. A., 521, 532, 703 

John, P. W. M., 154, 531, 541, 542, 703 

Johnson, D. E., 252, 263, 264, 700, 703, 
707 

Johnson, M. E., 108, 694 

Johnson, M. M., 108, 694 

Johnson, N. L., 59, 60, 322, 323, 510, 
574, 575, 576, 703 

Jones, B., 521, 703


Kafadar, K., 79, 703 

Kapadia, C. H., 580, 710 

Kastenbaum, M. A., 62, 64, 228, 691, 703 

Keating, J. P., 100, 102, 103, 659, 662, 
696 

Keedwell, A. D., 505, 695 

Kempthorne, O., 3, 5, 22, 146, 168, 223, 
264, 484, 485, 503, 516, 520, 524, 
701, 703, 715 

Kendall, D. G., 109, 690 

Kendall, M. G., 3, 14, 37, 311, 581, 703 

Kenward, M. G., 521, 703 

Keselman, H. J., 82, 86, 87, 108, 693, 
703, 705, 709 

Keselman, J. C., 87, 705 

Keuls, M., 80, 703 

Khargonkar, S. A., 514, 704 

Khuri, A. I., 29, 224, 226, 227, 311, 365,
407, 580, 698, 704, 710 

Khurshid, A., 141, 710 

Kihlberg, J. K., 111, 704 

Kimball, A. W., 200, 704 

King, S. J., 544, 692 

Kirk, R. E., 485, 521, 641, 704 


Kirkwood, B., 121, 122, 689 

Kleffe, J., 580, 709 

Kleiner, B., 89, 693 

Klett, G. W., 32, 713 

Klotz, J. H., 73, 704 

Koch, G. G., 523, 704 

Koehler, K. J., 79, 90, 704 

Kohr, R. L., 86, 87, 704 

Kotz, S., 59, 60, 574, 575, 576, 703 

Koval, J. J., 38, 695 

Kramer, C. Y., 82, 147, 503, 698, 704 

Krishnaiah, P. R., 9, 64, 263, 577, 689, 
704 

Krutchkoff, R. G., 86, 87, 269, 290, 702, 
705 

Kurtz, T. E., 64, 705 

Kussmaul, K., 580, 705 

Kutner, M. H., 8, 503, 609, 707 


Lambert, J. W., 516 

LaMotte, L. R., 224, 705 

Larntz, K., 90, 704, 705 

Larson, R., 411 

Laubscher, N. F., 100, 705

Layard, M. W. J., 86, 168, 690, 705 

Layne, B. H., 215, 702 

Lee, P. M., 9, 705 

Lehmann, E. L., 9, 23, 702, 705 

Lehmer, E., 57, 705 

Lentner, M., 492, 705 

Leone, F. C., 322, 323, 510, 574, 700, 703 

Levene, H., 107, 705 

Levine, G., 590, 705 

Levy, K. J., 86, 108, 705 

Lewis, T., 97, 690 

Lexis, W. H. R. A., 3

Lieberman, G. J., 61, 574, 685, 691, 709 

Lindman, H. R., 9, 705 

Lindsey, J. K., 522, 705 

Link, R. F., 64, 705

Linnerud, A. C., 233 

Linthurst, R. A., 533 

Littell, R. C., 8, 224, 522, 544, 545, 697, 
704, 705, 711 

Lix, L. M., 87, 705 

Lohnes, P. R., 307, 694 

Lord, N. M., 147, 695 

Lorenzen, T. J., 291, 307, 484, 527, 
528, 656, 705, 706 

Lu, T. F. C., 33, 714 


Lucas, H. L., 219, 698 
Lunn, D., xi, 699 
Lurigio, A., 550, 706 


Mackenzie, W. A., 4, 697 

Mahalanobis, P. C., 3, 706 

Mahamunulu, D. M., 403, 706 

Mahmoud, M. W., 62, 686, 688, 697 

Malone, L. C., 4, 708 

Mandel, J., 263, 706 

Maqsood, F., 38, 693 

Marcuse, S., 391, 392, 400, 706 

Mardia, K. V., 573, 706 

Massey, F. J., 79, 577, 696 

Mathew, T., 226, 704 

Maurais, J., 100, 706 

Maxwell, S. E., 521, 706 

McCabe, G. P., 492, 710 

McConway, K., xi, 699 

McCullagh, P., 8, 706 

McCulloch, C. E., 9, 29, 193, 225, 226, 
403, 580, 711 

McDonald, J., ix, 527, 531, 697 

McHugh, R. B., 8, 706 

McLean, R. A., 226, 523, 689, 706 

Mead, R., 200, 706 

Mensing, R. W., 263, 708 

Merrington, M., 617 

Micceri, T., 108, 706 

Michels, K. M., 29, 30, 485, 521, 582,
715 

Mielke, P. W., Jr., 8, 706 

Miller, J. J., 64, 206, 706, 713 

Miller, R. G., Jr., 23, 33, 64, 79, 82, 85, 
107, 268, 577, 706 

Milliken, G. A., 252, 263, 264, 365, 503, 
522, 545, 702, 705, 706, 707 

Millman, J., 307, 707 

Miron, T., 544, 707 

Mitchell, B. T., 577, 714 

Miyakawa, M., 263, 707 

Montgomery, D. C., 146, 311, 707 

Moran, M. A., 64, 635, 692 

Morrison, D. F., 9, 707

Moses, L. E., 79, 707 

Mosteller, F., 89, 172, 237, 238, 307, 691, 
701, 707 

Murdock, G. R., 32, 707 

Murray, L. W., 264, 712 

Myers, R. H., 291, 311, 707 


Nachtsheim, C. J., 8, 609, 707 
Nagasenkar, P. B., 100, 707 
Naik, U. D., 291, 295, 707 
Nair, K. R., 509, 577, 707 
Nair, U. S., 100, 691 

Natrella, M. G., 109, 520, 707 
Nelder, J. A., 8, 706 

Nelson, L. S., 64, 707 

Neter, J., 8, 503, 609, 707 
Newman, D., 80, 708 

Nie, N. H., 550 

Norton, H. W., 495, 708 


O’Brien, R. G., 108, 708 

Olejnik, S. F., 108, 708 

Olkin, I., 30, 708 

Oltman, D. O., 86, 694 

O’Neil, R., 64, 708 

Ord, J. K., 14, 37, 311, 703 

Ostle, B., 4, 263, 341, 342, 708 

Ostrowski, E., xi, 699 

Owen, D. B., 105, 112, 569, 574, 577, 
620, 671, 708 


Parker, E. T., 507, 708 

Paull, A. E., 200, 708 

Pearson, E. S., 57, 58, 79, 85, 91, 92, 93, 
100, 105, 106, 569, 570, 571, 573, 
577, 644, 655, 680, 708 

Pearson, K., 3, 570 

Peckham, P. D., 85, 86, 698 

Peng, K. C., 146, 307, 708 

Perry, J. N., 509, 708 

Petrinovich, L. F., 86, 708

Pillai, K. C. S., 577, 708 

Pitman, E. J. G., 168, 709 

Plackett, R. L., 4, 481, 709 

Pratt, J. W., 23, 30, 698, 708, 709

Probert, D. A., 108, 698 

Pukelsheim, F., 224, 709 


Quenouille, M. H., 521, 532, 703 
Quimet, R., 100, 706 


Ramachandran, K. V., 577, 708 
Ramsey, P. H., 107, 108, 709 
Rankin, N. O., 219, 709 

Rao, C. R., 8, 224, 263, 580, 709 


Rao, P. S. R. S., 225, 580, 709

Rao, P. V., 34, 699 

Ratkowski, D. A., 521, 709 

Reid, N., 8, 701 

Resnikoff, G. J., 574, 709 

Ringland, J. T., 86, 709 

Robertson, A., 37, 709 

Rochetti, E. M., 97, 699 

Rogan, J. C., 82, 86, 87, 703, 709 

Rohlf, F. J., 122, 123, 391, 392, 394, 
417, 418, 712 

Rosenthal, R., 64, 709 

Rosnow, R. L., 64, 709 

Rousseeuw, P. J., 97, 699 

Royston, J. P., 93, 95, 96, 709, 710 

Rubin, D. B., 147, 695, 710 


Sahai, H., 8, 28, 29, 61, 86, 141, 181, 
356, 580, 704, 710, 712 

Samuels, M. L., 492, 710 

Sanders, J. R., 85, 86, 698 

Sanders, W. L., 226, 706 

SAS Institute, 82, 520, 544, 545, 710 

Satterthwaite, F. E., 289, 308, 578, 710 

Scheffé, H., ix, 4, 22, 32, 58, 71, 73, 85, 
87, 88, 135, 181, 193, 263, 264, 
267, 268, 278, 279, 299, 307, 344, 
363, 399, 401, 582, 710, 711 

Schervish, M. L., 9, 711 

Schlotzhauer, S. D., 544, 711 

Schmitz, T. H., 33, 86, 690 

Schultz, E. F., Jr., 286, 305, 307, 439, 
441, 456, 711 

Schutz, W. E., 111, 704 

Searle, S. R., 8, 9, 29, 33, 37, 135, 193, 
218, 221, 223, 224, 225, 226, 363, 
366, 403, 475, 545, 580, 711 

Seely, J. F., 366, 711 


Seifert, B., 291, 711


Sen, B., 366, 711 

Seneca, E. D., 533 

Seng, M., 550, 706 

Senn, S., 521, 711 

Shaffer, J. P., 79, 82, 711 

Shah, K. R., 509, 695 

Shapiro, S. S., 93, 94, 95, 656, 657, 711 
Sharma, D., 263, 698 

Shiue, C., 407, 698 




Shooter, M., 82, 703 

Shrikhande, S. S., 507, 691 

Sidák, A., 80, 711

Sinecore, J., 550, 706 

Singh, B., 38, 711 

Singh, C., 86, 712 

Singhal, R. A., 34, 61, 86, 580, 711, 712 

Sinha, B. K., 226, 704 

Sinkbaek, S. A., 571, 699 

Smith, D. W., 264, 712 

Smith, H., vii, 109, 223, 695 

Smith, H. F., 226, 712 

Smith, J. K., 544, 693 

Snedecor, G. W., 48, 85, 147, 223, 238, 
378, 379, 503, 505, 516, 572, 
582, 596, 676, 712 

Snee, R. D., 532, 533, 712 

Snell, E. J., 8, 558, 701, 712 

Sokal, R. R., 122, 123, 391, 392, 394, 
417, 418, 712 

Solomon, H., 106, 696 

Spector, P. C., 8, 544, 705 

Spector, P. E., 544, 712 

Spjotvoll, E., 25, 83, 224, 712 

Sprent, P., 8, 712 

SPSS, Inc., 550, 551, 558, 565, 566, 
712 

Srivastava, A. B. L., 85, 200, 712 

Srivastava, S. R., 200, 712 

Stahel, W. A., 97, 699 

Steed, L. G., 550, 693 

Steel, G. D., 147, 233, 234, 393, 485, 
503, 516, 517, 526, 533, 534, 
538, 712 

Stepanski, E. J., 544, 700 

Stephens, M. A., 658 

Stoline, M. R., 64, 83, 84, 577, 654, 712, 
713, 714 

Strawderman, W. E., 98, 694 

Street, A. P., 504, 505, 713 

Street, D. J., 505, 713 

Stroup, W. W., 226, 522, 545, 705, 706, 
713 

Stuart, A., 3, 14, 37, 311, 581, 703 

Szatrowski, T. H., 206, 713 


Tabachnick, B. G., 543, 713 
Tabatabai, M. A., 87, 713 




Tamhane, A. C., 64, 83, 84, 268, 577, 
701, 713 

Tan, W. Y., 61, 86, 87, 224, 264, 365, 713 

Tang, P. C., 57, 713 

Tate, R. F., 32, 713 

Theune, J. A., 97, 713 

Thiele, T. N., 3 

Thomas, J. D., 38, 713 

Thompson, C. M., 611, 617 

Thompson, W. O., 28, 710 

Thomsen, I. B., 224, 713 

Thoni, H., 109, 713 

Tiao, G. C., 9, 692 

Tietjen, G. L., 365, 713 

Tiku, M. L., 57, 59, 85, 574, 575, 
576, 629, 714 

Ting, N., 33, 291, 366, 690, 711, 714 

Tippett, L. H. C., 4, 504, 509, 714 

Tiwari, C. B., 86, 580, 712 

Tocher, K. D., 147, 714 

Toothaker, L. E., 64, 82, 703, 714 

Torrie, J. H., 147, 233, 234, 393, 485, 
503, 516, 517, 526, 533, 534, 538, 
712 

Toutenburg, H., 8, 709 

Traux, H. M., 444, 689 

Tukey, J. W., 5, 64, 71, 79, 82, 89, 109, 
110, 111, 172, 181, 262, 264, 307, 
461, 464, 471, 481, 503, 694, 697, 
701, 703, 705, 707, 714 

Tukey, P. A., 89, 693 


Urey, F. R., 526 
Ury, H. K., 83, 577, 713, 714 


Vanderbeck, J. P., 571, 714 
Verdooren, L. R., 366, 715 
Vidmar, T. J., 400, 715 
Voet, B., 353, 700 


Wackerly, D. D., 34, 699 
Walker, H. M., 16, 715 

Wall, C., 509, 708 

Wallace, D. L., 64, 705 

Wallis, W. A., 664, 696 

Wang, S.-G., 8, 715 

Wasserman, W., 8, 503, 609, 707 
Watford, D., 121, 122, 689 




Webster, J. T., 291, 695

Weekes, A. J., 173, 715 

Weerahandi, S., 84, 98, 715 

Welch, B. L., 86, 87, 168, 289, 715 

Wells, G., 38, 695 

Werter, S. P. J., 86, 695 

Westmacott, M., 147, 700 

Wetherill, G. B., 64, 708 

Wheeler, R. E., 61, 715 

Wilcox, R. R., 86, 715 

Wilk, M. B., 93, 181, 223, 264, 503, 
656, 657, 711, 715 

Williams, J. S., 32, 715 

Williford, W. O., 32, 707 


Winer, B. J., 29, 30, 82, 485, 521, 582, 715
Winkler, H. B., 108, 698 
Wishart, J., 146, 689 




Wolach, A. H., 543, 715 
Wolfe, D. A., 9, 702 
Wolfinger, R. D., 522, 545, 705 
Wong, S. P., 61, 86, 713 
Wootton, R., 95, 710 
Wortham, A. W., 28, 699 


Yates, F., 3, 66, 112, 146, 217, 220, 368, 
497, 503, 507, 509, 519, 520, 541, 
569, 573, 697, 715, 716 

Yochmowitz, M. G., 263, 704 

Youden, W. J., 247, 248, 520, 716 


Zar, J. H., 86, 716 

Zelen, M., 222, 697 
Zemroch, P. J., 573, 706 
Zimmer, W. J., 64, 635, 692 


Subject Index 


Additivity. See Nonadditivity entries 
Aggregate variation, 2 
Agricultural experiments, 4 
Alternate mixed models, 264-268
Alternative hypothesis, defined, 597 
Analysis of covariance, 581-582 
Analysis of variance (ANOVA), 
for completely randomized design, 
485-486 
for Graeco-Latin square design, 
507-509 
historical developments in, 3-4 
for Latin square design, 498-500 
methodology, 1 
nonparametric, 9 
for partially nested classifications, 
433-436 
for randomized block design, 490-493 
for split-plot design, 513-516 
for three-way nested classification, 
396-399 
for two-way nested (hierarchical) 
classification, 350-351 
for unequal numbers of observations, 
in one-way classification, 36-39 
for various other designs or models. 
See under a specific design or 
model 
term, 2 
using BMDP, 558-559 
using SAS, 543-550 
using SPSS, 550-558 
using statistical computing packages,
543-567 
Analysis of variance F test. See F test 
Analysis of variance (ANOVA) models, 
defined, 4-5 
finite population, 7-8, 461-482. See
also Finite population models
entries


fixed effects. See Fixed effects model 
(Model I) 
linear, 4-5
mixed. See Mixed model (Model III) 
Model I. See Fixed effects model 
(Model I) 
Model II. See Random effects model 
(Model II) 
Model III. See Mixed model 
(Model III)
with multivariate response, 9 
one-way. See One-way classification 
random effects. See Random effects 
model (Model II) 
rules for determining, 588-590 
univariate, 8 
variance components. See Random 
effects model (Model II), Mixed 
model (Model III) 
Analysis of variance table, defined, 26 
ANOVA. See Analysis of variance 
(ANOVA) entries 
ANOVA procedure, in SPSS, 164, 252, 
333, 551, 552 
program and output, 165, 489 
Approximate confidence interval, 
defined, 600 
in two-way crossed finite population 
model, 470 
Approximate test, defined, 598 
Arcsine transformation, 110, 112 
Assumption of normality, 20, 135, 193, 
286, 351, 399, 407 


Balanced incomplete block design, 519 
Balanced lattice design, 520 
Balancing experimental units, 484 
Bartlett’s test for homogeneity of 
variances, 98-104, 106-107
table of critical values of, 659-662 




Bayesian inference, 9 
Behrens-Fisher problem, 84 
Best linear unbiased estimates (BLUE), 
in one-way classification, 27 
in two-way crossed classification with 
interaction, 204 
in two-way crossed classification 
without interaction, 140 
Beta statistic, 168 
Between group sum of squares, 15, 35-36 
Biomedical Programs. See BMDP 
Block design 
incomplete, 516, 519 
randomized. See Randomized block 
design 
Blocking experimental units, 484 
Blocks fixed 
treatments fixed and, 490-492
treatments random and, 493 
Blocks random
treatments fixed and, 492-493 
treatments random and, 492 
BLUE. See Best linear unbiased estimates 
BMD (Biomedical) package, 558 
BMDP (Biomedical Programs), 543. See 
also Statistical computing 
packages 
analysis of variance using, 558-559 
BMDP 3D, 1V, 4V, and 5V, 559 
BMDP 7D, 52, 164, 252, 253, 559, 565, 
567
program and output, 53, 54, 254, 489 
BMDP 2V, 164, 252, 253, 334, 381, 451, 
558, 559 
program and output, 165, 255, 335, 
496, 506, 511 
BMDP 3V, 52, 164, 252, 381, 421, 451, 
559 
program and output, 56, 385, 424 
BMDP 8V, 52, 55, 164, 252, 253, 381, 
421, 451, 559 
program and output, 55, 166, 167, 257, 
259, 336, 338, 382, 383, 386, 422, 
425, 451, 519
Bonferroni inequality, 78 
Bonferroni t statistic, 78-79, table of
critical values of, 645-647 
Bonferroni’s test/method/interval 
in one-way classification, 78-79, 80 




in three-way crossed classification, 
321-322 
in two-way crossed classification with 
interaction, 233 
in two-way crossed classification 
without interaction, 150 
in two-way nested classification, 
361-362, 365 
using statistical computing packages, 
561-563 
Boole inequality, 78 
Box-plots, 89 
Brown-Forsythe procedure, 107-108,
566 
BY keyword, in a SPSS procedure, 
551-552 


C test for homogeneity of variances. See 
Cochran’s C test for homogeneity 
of variances. 

Calculators, electronic, x 

Cells, 126 

Central limit theorem, 368, 569 

Chi-square distribution, definition and 
properties of, 570-571 

table of critical values of, 610-611 
noncentral. See Noncentral chi-square 
distribution 

Chi-square goodness-of-fit test for 
normality, 89-90 

CLASS keyword/statement, in a SAS 
procedure, 524, 546-549 

Classification, term, 125-126 

Cochran’s C test for homogeneity of 
variances, 98, 105-107

table of critical values of, 664 

Coefficient of kurtosis, 91-92 

table of critical values of sample 
estimate of, 655 

Coefficient of skewness, 90-91 

table of critical values of sample 
estimate of, 655 

Coefficients of order statistics for the 
Shapiro-Wilk’s W test for 
normality, table of, 656 

Column efficiency, of a Latin square 
design, 503 

Completely nested design, definition and 
examples of, 347-348, 395 




Completely randomized design (CRD), 
485-488, 489 
analysis of variance for, 486-487
mathematical model of, 486 
worked example for, 487, 488 
Components of variance. See Variance 
components 
Computational formulae and procedure 
for sums of squares 
in Latin square design, 502 
in one-way classification, 35-36
in partially nested classifications, 437 
in three-way crossed classification, 
297-298 
in three- and four-way nested 
classifications, 396-397, 401, 
403-404 
in two-way crossed classification with 
interaction, 210-212 
in two-way crossed classification 
without interaction, 144-145 
in two-way nested (hierarchical) 
classification, 359, 362-363 
Computing power, use of statistical 
packages for, 560-561 
Concomitant variable, term, 581 
Confidence coefficient, defined, 599 
Confidence intervals for variance 
components, in one-way random 
effects model, 31-34 
Conservative confidence interval, 
defined, 600 
Conservative test, defined, 598 
Consistency, defined, 599 
Consistent estimator, defined, 599 
CONTRAST command/subcommand, in
SAS ANOVA/GLM and SPSS 
ONEWAY/GLM procedures, 
562-565 
Contrasts, defined, 65 
test of hypothesis involving, 67-69
Control, in experimental design, 
484-485 
Correlation, defined, 588 
Covariance, defined, 587-588 
analysis of, 581-582 
Covariate, term, 581 
CRD. See Completely randomized 
design 




CRITERIA subcommand, in SPSS GLM 
and VARCOMP procedures, 558, 
561 
Critical range values, Studentized, 80 
Critical region, defined, 597 
Critical values, defined, 597 
of Bartlett’s test for homogeneity of 
variances, table of, 659-662 
of chi-square distribution, table of, 
610-611 
of Cochran’s C test for homogeneity of 
variances, table of, 664 
of D’Agostino’s D test for normality, 
table of, 658 
of Duncan’s multiple range test, table 
of, 642-644 
of Dunn-Sidák’s multiple comparison
test, table of, 648-651 
of Dunnett’s test, table of, 639-641 
of F distribution, table of, 612-617
of Hartley’s maximum F ratio test for 
homogeneity of variances, table 
of, 663 
of sample estimate of coefficient of 
kurtosis, table of, 655 
of sample estimate of coefficient of 
skewness, table of, 655 
of Shapiro-Wilk’s W test for 
normality, table of, 657 
of Studentized augmented range 
distribution, table of, 654 
of Studentized maximum modulus 
distribution, table of, 652-653 
of Studentized range distribution, table 
of, 636-638 
of Student’s t distribution, table of,
608-609 


Cross-nested design, defined, 431 
Cross-over design, 520-521 
Crossed classification, contrasted with a
nested classification, 347
term, 125-126
Cumulative standard normal distribution,
table of critical values of,
605-606
Curves of constant power for F tests in
fixed effects model (Model I) for
determination of sample size in a
one-way classification, 686-688



D’Agostino’s D test for normality, 96-97
table of critical values of, 658 
Degrees of freedom, 2, 569, 570, 
572-576 
concept of, 15-17 
for error variance in a one-way 
classification, 125 
in partially nested classifications, 435 
rules for calculating, 590-591 
DESIGN keyword/subcommand, in 
SPSS GLM and MANOVA 
procedures, 552-557 
Detecting outliers, 97 
Doubly noncentral beta, related to doubly 
noncentral F, 576
Doubly noncentral F distribution, 262 
definition and properties of, 576 
Doubly noncentral t distribution,
definition and properties of, 575 
Duncan’s multiple range test, 81, 
561-563 
table of critical values of, 642-644 
Dunn-Sidák’s multiple comparison test,
79-80, 562-563
table of critical values of, 648-651
Dunnett’s multiple comparison test, 
81-82, 563 
table of critical values of, 639-641 
Dunn’s multiple comparison test, 79 
table of critical values of, 645-647 


Effects parameter, 594 

Efficiency, defined, 599 

Efficient estimator, defined, 599 

Electronic calculators, x 

EMS (error mean square), its 
custom-made use for multiple 
comparisons using statistical 
packages, 561 

Equal variances, departures from, 86-88 

Error mean square (EMS). See EMS 
(error mean square) 

Error sum of squares, 129, 183, 286, 351, 
397, 435, 464, 475, 486, 491, 499, 
508, 513 

Error terms, 5, 589 

departures from independence of, 

88-89 

Error variance, degrees of freedom in 
one-way classification for, 125 




Estimate, defined, 598 
Estimated total variance, in one-way 
random effects model, 29 
Estimator, defined, 598 
Exact confidence interval, defined, 600 
Exact statistical test, defined, 598 
EXAMINE procedure, in SPSS, 567 
Expectations of mean squares 
in fixed effects models (Model I),
two-way crossed classification with 
interaction, 186-188 
two-way crossed classification 
without interaction, 131-132 
in mixed models (Model III),
in two-way crossed classification 
with interaction, 190-193 
in two-way crossed classification 
without interaction, 133 
in one-way classification, 17-20
in random effects model (Model II)
in two-way crossed classification 
with interaction, 188-190 
in two-way crossed classification 
without interaction, 132-133 
in three-way crossed classification, 286 
in two-way crossed classification with 
interaction, 184-193 
in two-way crossed classification 
without interaction, 130-134 
in various other designs or models. See 
under a specific design or model 
Expected mean squares, rules for finding, 
591-595 
Expected subclass numbers, method of, 
222 
Expected value, defined, 586-587 
Experimental data, collecting, 483 
Experimental designs, 
in agriculture and other sciences, 4 
literature on, 485 
principles of, 483-485 
some simple, 483-542 
Experimental unit, defined, 484 


F distribution, 23, 138, 198, 199, 201, 
202, definition and properties of, 
572-573 

table of critical values of, 612-617
doubly noncentral, 262, definition and 
properties of, 576 




F distribution (cont.) 
noncentral. See Noncentral 
F distribution 
F test(s) 
equivalent to paired t test, 584-586
equivalent to two-sample t test, in
one-way classification, 582-584 
in fixed effects model (Model I) 
curves of constant power for 
determination of sample size in 
one-way classification, 686-688
in three-way crossed classification, 
288, 289 
in two-way crossed classification 
with interaction, 198-200 
in two-way crossed classification 
without interaction, 137-139 
power function charts of, 672-680 
in mixed model (Model III) 
in three-way crossed classification, 
291-292 
in two-way crossed classification 
with interaction, 202-203 
in two-way crossed classification 
without interaction, 139 
in one-way classification, 22-25
in random effects model (Model II)
in one-way classification, table of 
power and optimum number of 
levels in, 630-633 
in three-way crossed classification, 
288-291, 292 
in two-way crossed classification 
with interaction, 200-202 
in two-way crossed classification 
without interaction, 139 
operating characteristic curves for, 
61, 681-685 
in three-way crossed classification, 
286, 288-292 
in two-way crossed classification with 
interaction, 197-203 
in two-way crossed classification 
without interaction, 137-139 
in two-way crossed finite population 
model, 466-467 
in two-way nested (hierarchical) 
classification, 351, 353, 363-365 
in various other designs or models. See 
under a specific design or model 




power of. See Power of F test 
F value in an analysis of variance 
table, 26 
Feldt-Mahmoud charts, 62, 686-688 
Finite population models, 461-482 
four-way crossed, 472-473
more complex, 481 
nested, 474-475 
one-way, 461-462 
statistical computing packages in, 481
three-way crossed, 470-472
two-way crossed, 462-470
unbalanced, 475
worked example for, 475-481 
Finite population theory, 8 
Finite populations, 7-8, 461 
Fisher’s Z distribution, related to F 
distribution, 572 
Fixed effects, 4, 5, concept of, 6-7
Fixed effects analysis, in an unbalanced 
two-way crossed classification 
model, 215-223 
general case of unequal frequencies 
for, 217-223 
proportional frequencies for, 
215-217 
Fixed effects model (Model I), 4, 5, 
481 
curves of constant power for F tests for 
determination of sample size in 
one-way classification, 686-688
effects of violations of assumptions of 
two-way crossed classification 
with interaction, 269 
expectations of mean squares in, 
in two-way crossed classification 
with interaction, 186-188 
in two-way crossed classification 
without interaction, 131-132
F tests 
in one-way classification, 22-24
in three-way crossed classification, 
288, 289 
in two-way crossed classification 
with interaction, 198-200 
in two-way crossed classification 
without interaction, 137-139 
in various other designs or models. 
See under a specific design or 
model 




Fixed effects model (Model I) (cont.) 
interval estimation 
in two-way crossed classification 
with interaction, 207-208 
in two-way crossed classification 
without interaction, 142-143 
in two-way nested (hierarchical) 
classification, 356-357 
one-way classification, table of 
minimum sample size per 
treatment group needed in, 
634-635 
point estimation in 
in two-way crossed classification 
with interaction, 203-205
in two-way crossed classification 
without interaction, 
139-141
in two-way nested (hierarchical) 
classification, 354-355 
in various other designs or models. 
See under a specific design or 
model 
power function charts of F tests in, 
672-680 
power of F test in 
in one-way classification, 57-60 
in two-way crossed classification 
with interaction, 227-228 
in various other designs or models. 
See under a specific design or 
model 
sampling distribution of mean squares 
in 
in two-way crossed classification 
with interaction, 193, 195-196 
in two-way crossed classification 
without interaction, 135-136 
in various other designs or models. 
See under a specific design or 
model 
worked examples for 
in one-way classification, 39-43
in three-way crossed classification, 
314-322 
in two-way crossed classification 
with interaction, 233-237 
for unequal sample sizes per 
cell, 237-244 




in two-way crossed classification 
without interaction, 151-154 
in two-way nested (hierarchical) 
classification, 368-371 
in various other designs or models.
See under a specific design or 
model 
Four-factor partially nested classification, 
438-439, 440 
Four-way crossed classification, 302, 
304-307 
Four-way crossed finite population 
model, 472-473
Four-way nested classification, 403-406
Fox charts, 58 
Fractional replications, 523-524 


General case of unequal frequencies 
for fixed effects analysis, 217-223 
for random effects analysis, 223-226 
General constant, in a linear model, 5 
General linear models (GLM), 5, 8 
General q-way nested classification,
406-407 

Generalized linear models, 8 

Generalized randomized block design 
(GRBD), 494 

GLM (General linear models), 5, 8 

GLM procedure, in SPSS, 52, 164, 252, 
253, 333, 334, 381, 421, 451, 
551-554, 556, 557 

program and output, 55, 56, 166, 167,
256, 258, 336, 337, 383, 384, 386,
422, 423, 425, 450, 518

Graeco-Latin square design, 507-512 
analysis of variance for, 507-509 
mathematical model for, 507 
worked example in, 510-512 

Graeco-Latin squares, 507 
some more examples of, 602-603 

GRBD (generalized randomized block
design), 494
Grouping experimental units, 484 


Hartley’s maximum F ratio test for 
homogeneity of variances, 98, 
104-105, 106-107 

table of critical values of, 663 

Hierarchical models, partially, 431-460 




Hierarchically nested design. See 
Completely nested design
Higher-order crossed classifications, 
307-311 
Homogeneity of variances 
Bartlett’s test for, 98-104, 106-107 
table of critical values of, 659-662 
Cochran’s C test for, 98, 105-107 
table of critical values of, 664 
Hartley’s maximum F ratio test for, 
98, 104-105, 106-107 
table of critical values of, 663 
other tests for, 107-108 
HOMOGENEITY option, in SPSS 
procedures, 566 
Homoscedasticity 
tests for, 97-108. See also 
Homogeneity of variances 
entries 
use of statistical packages for, 
565-567 
transformations to correct lack of, 
110-113 
HOVTEST option, in SAS GLM 
procedure, 566 
Hyper-Graeco-Latin square design, 522 
Hypersquares, 522 
Hypothesis testing, general procedure of, 
597-598 


Incomplete block design, 516, 519 
Independence of error terms, departures 
from, 88-89 
Independent variable, term, in analysis of 
covariance, 581 
Infinite population theory, 461 
Infinite populations, 7-8, 461 
Interaction, defined, 177 
meaning and interpretation of, 253, 
256-259 
with one observation per cell, 
259, 261-264 
significant, 257 
two-way crossed classification with. 
See Two-way crossed 
classification with interaction 
two-way crossed classification 
without. See Two-way crossed 
classification without interaction 




Interaction effect, in a two-way crossed 
classification model, 180 
Interaction sum of squares, in a two-way 
crossed classification model, 183 
Interaction terms, in an analysis of 
variance model, 589 
Interval estimation, general method of, 
599-601 
in fixed effects model (Model I) 
in two-way crossed classification 
with interaction, 207-208
in two-way crossed classification 
without interaction, 142-143 
in two-way nested (hierarchical) 
classification, 356-357 
in various other designs or models. 
See under a specific design or 
model 
in Latin square design, 500-501 
in mixed model (Model III)
in two-way crossed classification 
with interaction, 209-210 
in two-way crossed classification 
without interaction, 143-144 
in two-way nested (hierarchical) 
classification, 359 
in various other designs or models. 
See under a specific design or 
model 
in random effects model (Model II) 
in two-way crossed classification 
with interaction, 208-209 
in two-way crossed classification 
without interaction, 143 
in two-way nested (hierarchical) 
classification, 358 
in various other designs or models. 
See under a specific design or 
model 
in three-way crossed classification, 
292-297 
in two-way crossed classification with 
interaction, 207-210
in two-way crossed classification 
without interaction, 142-144 
in two-way crossed finite population 
model, 468-470
in two-way nested (hierarchical) 
classification, 356-359, 365-367 




Interval estimation (cont.) 
in various other designs or models. See 
under a specific design or model 
Intraclass correlation, definition and 
properties of, 580-581 
Intraclass correlation coefficient, defined, 
581 
Intraclass correlations
in one-way classification, 29, 30, 34, 
38 
in two-way crossed classification with 
interaction, 182 
in two-way crossed classification 
without interaction, 128 
Inverse hyperbolic sine transformation, 
110n 


Jackknife technique, in finding 
confidence interval of a variance 
component, 33n, in testing
homogeneity of variances, 
107-108 


Kruskal-Wallis test, 87 
K test, 87, 269 
Kurtosis, 85, 91 
coefficient of. See Coefficient of 
kurtosis 
test for, 91-93 


Lagrangian interpolation, three-point, 
to calculate power of an F test, 621 
Latin square design, 495, 497-507 
analysis of variance for, 498-500 
computational formulae and procedure 
for sums of squares in, 502 
interval estimation in, 500-501 
mathematical model of, 498 
missing observations in, 502-503
multiple comparisons in, 501 
point estimation in, 500-501 
power of F test in, 501 
relative efficiency of, 503-504 
replications in, 504-505
worked example in, 505-507 
Latin squares, 495, 497 
some more examples of, 601-602
Lattice design, 519-520
Least significant difference test, 77-78
Least squares estimators 




in two-way crossed classification with 
interaction, 203 
in two-way crossed classification 
without interaction, 139 
in two-way nested classification, 354 
Level of confidence, definition and 
interpretation of, 599 
Levene’s test for homogeneity of 
variances, 107, 566-567 
Liberal interval, defined, 600 
Liberal test, defined, 598 
Linear combination of means, defined, 65 
Linear (statistical) models, defined, 8 
Logarithmic transformation, 109, 111 
Lower confidence interval, defined, 600 
LSD (least significant difference),
77-78


Magic Latin squares, 522 
Main effects, in two-way crossed 
classification with interaction, 
180 
MANOVA procedure, in SPSS, 164, 252, 
253, 334, 381, 451, 551-555 
program and output, 254, 255, 334, 
382, 496, 506, 511 
Mathematical expectation, defined, 
586-587 
Maximum likelihood estimators 
in one-way classification, 28 
in two-way crossed classification with 
interaction, 203n, 206n 
in two-way crossed classification 
without interaction, 139n, 141 
in two-way nested classification, 356 
Maximum test for scale, for homogeneity 
of variances, 108 
MAXORDERS subcommand, in SPSS 
ANOVA procedure, 553 
Mean squares, defined, 2 
expectations of. See Expectations of 
mean squares 
expected, rules for finding, 591-595 
sampling distribution of. See Sampling 
distribution of mean squares 
Mean value, defined, 587 
Means 
linear combination of, defined, 65 
range of the set of, defined, 80 
MEANS procedure, in SPSS, 550 




MEANS statement, in SAS GLM 
procedure, 52, 562, 565-566 
Method of expected subclass numbers, 222 
Method of unweighted means, 217-220 
Method of weighted-squares-of-means, 
220-222 
METHOD subcommand, in SPSS 
VARCOMP procedure, 557-558 
Minimum norm quadratic unbiased 
estimation (MINQUE), 224 
Minimum sample size per treatment 
group needed in one-way fixed 
effects design, table of, 634-635 
Minimum variance quadratic unbiased 
estimators (MIVQUE), 224 
Minimum variance unbiased (MVU) 
estimator, defined, 599 
MINQUE (minimum norm quadratic 
unbiased estimation), 224 
Missing observations or values 
in Latin square design, 502-503 
in split-plot design, 514, 516 
in two-way crossed classification 
without interaction, 145-148 
worked example for, in two-way 
crossed classification without 
interaction, 161-164 
MIVQUE (minimum variance quadratic 
unbiased estimators), 224 
Mixed-classification design, defined, 431 
Mixed effects analysis, in an unbalanced 
two-way crossed classification 
model, 226-227 
Mixed model (Model III), 4, 5, 481
alternate, 264-268 
effects of violations of assumptions of 
two-way crossed classification 
with interaction, 270 
expectations of mean squares in 
in two-way crossed classification 
with interaction, 190-193 
in two-way crossed classification 
without interaction, 133 
F tests in 
in three-way crossed classification, 
291-292 
in two-way crossed classification 
with interaction, 202-203 
in two-way crossed classification 
without interaction, 139 




in various other designs or models. 
See under a specific design or 
model
interval estimation in 
in two-way crossed classification 
with interaction, 209-210 
in two-way crossed classification 
without interaction, 
143-144 
in two-way nested (hierarchical) 
classification, 359, 367 
in various other designs or models. 
See under a specific design or 
model 
point estimation in 
in two-way crossed classification 
with interaction, 206-207 
in two-way crossed classification 
without interaction, 141 
in two-way nested (hierarchical) 
classification, 356, 357 
in various other designs or models. 
See under a specific design or 
model 
power of F test in, in two-way crossed 
classification with interaction, 
229-230
in various other designs or models. 
See under a specific design or 
model 
sampling distribution of mean 
squares in
in two-way crossed classification 
with interaction, 196-197 
in two-way crossed classification 
without interaction, 136-137 
Scheffé’s, 267-268, 285n 
worked examples for 
in partially nested classifications, 
444-448, 449 
in three-way crossed classification, 
328-333 
in three-way nested classification, 
417-420 
in two-way crossed classification 
with interaction, 247-252
in two-way crossed classification 
without interaction, 157-161 
in two-way nested (hierarchical) 
classification, 378-380 




Model I. See Fixed effects model 
(Model I) 
Model II. See Random effects model 
(Model II)
Model III. See Mixed model (Model III) 
MODEL keyword/statement, in SAS 
procedures, 524-525, 546-549 
Modified sequentially rejective 
Bonferroni (MSRB) test, 79 
MSRB (modified sequentially rejective 
Bonferroni) test, 79 
Multifactor layouts, 281 
Multiple comparisons 
in Latin square design, 501 
in one-way classification, 64-84 
Bonferroni’s test for. See 
Bonferroni’s test/method/interval
Dunn-Sidák test for, 79-80
Dunnett’s test for, 81-82 
Dunn’s procedure of, 78-79 
least significant difference test for, 
77-78
MSRB test for, 79 
Newman-Keul’s test for, 80-81 
Scheffé’s method of. See Scheffé’s 
method of multiple comparison 
SRB test for, 79 
Tukey’s method of. See Tukey’s 
method of multiple comparison 
various other methods of, 82-84
in three-way crossed classification, 
299-301 
in two-way crossed classification with 
interaction, 230-233 
in two-way crossed classification 
without interaction, 149-151 
in two-way nested (hierarchical) 
classification, 360-362 


use of statistical packages for, 561-565 


Multiple regression analysis, method 
based on, 222-223 
Multivariate response, analysis of 
variance (ANOVA) models with, 9 
Mutual orthogonality of contrasts, 66 
MVU (minimum variance unbiased) 
estimator, defined, 599 


Negative estimates of variance 
components, 28, 141, 206, 356 




Nested classifications 
four-way, 403-406 
general q-way, 406-407
partially. See Partially nested 
classifications 
three-way. See Three-way nested 
classification 
Nested design, completely or 
hierarchically, 348-349 
Nested-factorial design, defined, 431 
Nested finite population models, 
474-475
two-way, 474-475 
Newman-Keul’s test, 80-81, 561-563, 
565 
Nonadditivity, sum of squares for, 262 
Tukey’s one degree of freedom test for, 
262-264, 503 
Noncentral chi-square distribution, 21, 
135, 195, 197, definition and 
properties of, 573-574 
Noncentral F distribution, 59, 60, 138, 
139, 148, 227, 229, 262, 
definition and properties of, 
575-576 
Noncentral t distribution, 251, 618, 669,
definition and properties of, 
574-575 
Noncentrality parameter, 21, 57-59, 135, 
136, 138, 195, 251, 298, 618, 669, 
defined, 573 
Nonnegative maximum likelihood 
estimators of variance 
components, 28n, 141n, 206n, 
356n 
Nonparametric analysis of variance, 9 
Normality 
assumption of, 20, 135, 193, 286, 351, 
399, 407 
chi-square goodness-of-fit test for,
89-90 
D’Agostino’s D test for. See 
D’Agostino’s D test for normality
effects of departures from assumption 
of, 85-86 
Shapiro-Francia’s test for, 94-96 
Shapiro-Wilk’s W test for. See 
Shapiro-Wilk’s W test for 
normality 




Normality (cont.) 
tests for, 89-97 
use of statistical packages for tests of, 
567 
NPAR TESTS procedure, in SPSS, 567 
Null hypothesis, defined, 597 


O’Brien procedure for testing 
homogeneity of variances, 108 
Observations, physical, variation among, 1
One observation per cell, three-way 
classification with, 301-302, 303 
One-sided confidence interval, defined, 
600 
One-way classification, 11-123 
advantages and disadvantages of, 125 
assumptions of, 11-12
computational formulae and procedure 
for sums of squares in, 35-36 
confidence intervals for variance 
components in, 31-34 
corrections for departures from 
assumptions of, 108-113 
effects of departures from assumptions 
underlying, 84-89 
F test equivalent to two-sample t test
in, 582-584 
F tests in, 22-25 
mathematical model of, 11 
point estimation in, 26-31 
power of F test 
in fixed effects model (Model I), 
57-60 
in random effects model (Model II), 
60-61 
statistical computing packages in, 52 
worked examples using, 52, 53-56 
tests for departures from assumptions 
of, 89-108 
One-way finite population model, 
461-462 
One-way fixed effects design, table of 
minimum sample size per 
treatment group needed in, 
634-635 
One-way random effects design, table of 
power and optimum number of 
levels in, 630-633
ONEWAY procedure, in SPSS, 52,
550-552
program and output, 53, 54, 489
Operating characteristic curves for F
tests in random effects model
(Model II), 61, 681-685
Orthogonal contrasts, defined, 65
Outliers, detecting, 97


p-value, defined, 23, in hypothesis 
testing, 597 
Paired t test, F test equivalent to,
584-586 
Parameter, defined, 598 
Partially hierarchical models, 434 
Partially nested classifications, 431-460 
analysis of variance for, 433-436 
computational formulae and procedure 
for sums of squares in, 437 
degrees of freedom in, rule for finding, 
435 
four-factor, 438-439 
mathematical model of, 431-433 
statistical computing packages in, 448
worked example using, 450-451 
Partition of the total sum of squares 
in one-way classification, 14-15
in three-way crossed classification, 
285-286 
in two-way crossed classification with 
interaction, 182-183 
in two-way crossed classification 
without interaction, 128-129 
Pearson-Hartley charts, 58, 59, 61, 148, 
251, 672-680 
Percentage points of the standard normal 
distribution, table of, 607 
Percentiles of the chi-square distribution, 
table of, 610-611 
Physical observations, variation among, 1 
Point estimation, general method of, 
598-599 
in fixed effects model (Model I) 
in two-way crossed classification 
with interaction, 203-205 
in two-way crossed classification 
without interaction, 139-141 
in two-way nested (hierarchical) 
classification, 354-355
in Latin square design, 500-501 
in mixed model (Model III) 
in two-way crossed classification 
with interaction, 206-207
in two-way crossed classification 
without interaction, 141
in two-way nested (hierarchical) 
classification, 356, 357 
in one-way classification, 26-31 
in random effects model (Model II) 
in two-way crossed classification 
with interaction, 205-206
in two-way crossed classification 
without interaction, 141 
in two-way nested (hierarchical) 
classification, 355-356 
in three-way crossed classification, 
292-297 
in two-way crossed classification with 
interaction, 203-207 
in two-way crossed classification 
without interaction, 139-141 
in two-way crossed finite population 
model, 467-468 
in two-way nested (hierarchical) 
classification, 354-356, 365-366 
Population variances, multiple 
comparison for unequal, 84 
Populations, finite and infinite, 7-8 
POSTHOC subcommand, in SPSS 
ONEWAY and GLM procedures, 
563-564 
Power and optimum number of levels in 
one-way random effects F test, 
table of, 630-633 
Power function charts 
of F test in fixed effects model 
(Model I), 672-680 
of two-sided Student’s t test,
669-671 
Power of a test, defined, 597 
Power of F test 
in fixed effects model (Model I), table 
of, 621-629 
in Latin square design, 501 
in one-way classification, 52, 57-61 
in random effects model (Model II), 
table of optimum number of 
levels and, 630-633
in three-way crossed classification, 
298-299 
in two-way crossed classification with 
interaction, 227-230 
in fixed effects model (Model I), 
227-228 
in mixed model (Model III), 
229-230 
in random effects model (Model II), 
228-229 
in two-way crossed classification 
without interaction, 148-149 
in two-way nested (hierarchical) 
classification, 360 
Power of Student’s t test, table of,
618-620 
POWER subcommand, in SPSS 
MANOVA procedure, 560 
Power transformation, 112-113 
PRINT subcommand, in SPSS 
VARCOMP procedure, 558 
PROC ANOVA, in SAS, 52, 164, 252, 
253, 333, 334, 381, 448, 524, 
544-547, 549 
program and output, 53, 54, 165, 254, 
334, 335, 489, 496, 506, 511 
PROC GLM, in SAS, 52, 164, 252, 253,
333, 334, 381, 421, 448, 451, 524, 
544, 549 
program and output, 55, 56, 166, 167, 
255, 256, 258, 337, 382, 384, 385, 
421, 423, 424, 450, 518 
PROC LATTICE, in SAS, 520 
PROC MIXED, in SAS, 164, 252, 333, 
381, 448, 525, 545, 549 
program and output, 604 
worked examples using, 603-604 
PROC MULTTEST, in SAS, 562 
PROC NESTED, in SAS, 381, 448, 544, 
545, 547, 548 
program and output, 383 
PROC UNIVARIATE, in SAS, 52, 
567 
PROC VARCOMP, in SAS, 164, 252,
333, 381, 448, 545, 549, 557 
Proportional frequencies in an 
unbalanced three-way crossed 
classification, 311-314 
unbalanced two-way crossed 
classification,
for fixed effects analysis, 215-217 
for random effects analysis, 223 
Protected least significant difference, 
77-78 
Pseudo-F test 
in higher order crossed classifications, 
308 
in partially nested classifications, 
442-443 
in three-way crossed classification, 
290-292, 293, 327 
in two-way crossed finite population 
model, 466-467, 477-479 
use of Satterthwaite procedure 
in constructing, 578-580 


Quasi-factorials, 519 


Random effects, 4, 5, concept of, 6-7 
Random effects analysis, in an 
unbalanced two-way crossed 
classification model, 223-226 
general case of unequal frequencies 
for, 223-226 
proportional frequencies for, 223 
Random effects model (Model II), 4, 5,
481 
effects of violations of assumptions of 
two-way crossed classification 
with interaction, 269 
expectations of mean squares in 
in two-way crossed classification 
with interaction, 188-190 
in two-way crossed classification 
without interaction, 132-133 
F tests in 
in one-way classification, 24-25
in three-way crossed classification, 
288-291
in two-way crossed classification 
with interaction, 200-202
in two-way crossed classification 
without interaction, 139 
in various other designs or models. 
See under a specific design or 
model 
interval estimation in 
in two-way crossed classification 
with interaction, 208-209
in two-way crossed classification 
without interaction, 143 
in two-way nested (hierarchical) 
classification, 358 
in various other designs or models. 
See under a specific design or 
model 
operating characteristic curves for F 
tests in, charts of, 61, 681-685 
point estimation in 
in two-way crossed classification 
with interaction, 205-206 
in two-way crossed classification 
without interaction, 141 
in two-way nested (hierarchical) 
classification, 355-356 
in various other designs or models. 
See under a specific design or 
model 
power and optimum number of levels 
of F test in one-way, table of, 
630-633 
power of F test in 
in one-way classification, 60-61 
in two-way crossed classification 
with interaction, 228-229 
in various other designs or models. 
See under a specific design or 
model 
sampling distribution of mean 
squares in 
in two-way crossed classification 
with interaction, 196 
in two-way crossed classification 
without interaction, 136 
in various other designs or models. 
See under a specific design or 
model 
worked examples for 
in one-way classification, 43-52
in partially nested classifications, 
439-443 
in three-way crossed classification, 
322-328 
in three-way nested classification, 
407-411 
in two-way crossed classification 
with interaction, 244-247 
in two-way crossed classification 
without interaction, 154-157
in two-way nested (hierarchical) 
classification, 371-374 
for unequal numbers in subclasses 
in three-way nested classification, 
411-417 
in two-way nested (hierarchical) 
classification, 374-378 
RANDOM keyword/statement/ 
subcommand, in SAS and SPSS 
GLM procedures, 252, 333, 381, 
448, 525, 546, 547-548, 556 
Random numbers, table of, 665-668 
Random sample, defined, 595-596 
Randomization, in experimental design, 
483-484, 488, 493, 497 
Randomized block design (RBD), 488, 
490-495, 496 
analysis of variance for, 490-491 
generalized (GRBD), 494 
mathematical model of, 490 
missing observations in, 493 
relative efficiency of, 493-494 
replications in, 494 
worked example for, 494-495 
Randomized design, completely. See 
Completely randomized design 
Randomness, concept of, 596 
Range of the set of means, defined, 80 
RANGES subcommand, in SPSS 
ONEWAY procedure, 562-563 
RBD. See Randomized block design 
RE. See Relative efficiency 
Reciprocal transformation, 111-112 
Relative efficiency (RE), defined, 493n
of randomized block design, 493-494
of Latin square design, 503-504
Reliability, of an estimate, 599 
REML, option for computing restricted 
maximum likelihood estimates in 
SAS PROC MIXED and SPSS 
VARCOMP procedures, 549, 557 
Repeated measures design, 521-522 
Replications, in experimental design, 
483 
in Latin square design, 504 
in randomized block design, 494 
Representative sample, concept of, 
595
Restricted maximum likelihood. See 
REML
Row efficiency, of a Latin square design, 
504 


Sample size, power and determination of, 
61-64 
Sample size determination using smallest 
detectable difference, 63-64 
Sample, defined, 595 
Sampling distribution, defined, 596 
Sampling distribution of mean squares 
in fixed effects model (Model I) 
in two-way crossed classification 
with interaction, 193, 195-196 
in two-way crossed classification 
without interaction, 135-136 
in mixed model (Model III)
in two-way crossed classification 
with interaction, 196-197 
in two-way crossed classification 
without interaction, 136-137
in one-way classification, 20-22 
in random effects model (Model II) 
in two-way crossed classification 
with interaction, 196 
in two-way crossed classification 
without interaction, 136 
in two-way crossed classification with 
interaction, 193-197
in two-way crossed classification 
without interaction, 135-137 
SAS (Statistical Analysis System), 
543-550. See also Statistical 
computing packages and entries 
following PROC 
SAS PROBMC function, 82 
Satterthwaite procedure, 144, 209, 210, 
365, 367, 403, 468, 578-580 
Scheffé’s mixed model, 267-268, 285n 
Scheffé’s method of multiple comparison 
in Latin square design, 501 
in one-way classification, 73-76
effects of departures from 
assumptions in, 86, 87 
interpretation of, 76-77 
relative merits and drawbacks of, 77 
in three-way crossed classification, 
300-301
in two-way crossed classification with 
interaction, 230, 231, 232, 233, 
251 
in two-way crossed classification 
without interaction, 150, 
153-154, 160-161 
in two-way nested (hierarchical) 
classification, 361, 365 
using statistical computing packages, 
561-565 
Sequentially rejective Bonferroni (SRB) 
procedure, 79 
Shapiro-Francia’s test for normality, 
94-96 
Shapiro-Wilk’s W test for normality, 
93-94, 102-103 
table of coefficients of order statistics 
for, 656 
table of critical values of, 657 
Significance level, concept of, 597 
Skewness, 85, 90 
coefficient of. See Coefficient of 
skewness 
test for, 90-91 
Smallest detectable difference, sample 
size determination using, 
63-64 
Snedecor’s F distribution. See F 
distribution 
SPECIAL keyword, in SPSS GLM 
procedure, 564 
Split-plot design, 512-516 
analysis of variance for, 513-515 
mathematical model of, 513 
missing values in, 514, 516 
worked example for, 516, 517-519 
Split-split-plot design, 523 
SPSS, 543, 550-558. See also Statistical 
computing packages and entries 
following a particular procedure 
Square-root transformation, 109-110, 
111 
Square transformation, 112 
SRB (sequentially rejective Bonferroni) 
procedure, 79 
Standard deviation, defined, 587 
Standard error of an estimator, defined, 
599n
Standard normal distribution
cumulative, table of, 605-606
percentage points of, table of, 607
Statistic, term, 596
Statistical Analysis System. See SAS
Statistical computing packages, 543
analysis of variance using, 543-567
in finite population models, 481
in one-way classification, 52
worked examples using, 52, 53-56
in partially nested classifications, 448
worked example using, 450-451
in three-way crossed classification, 
333-334 
worked examples using, 334-338 
in three-way nested classifications, 421 
worked examples using, 421-425
in two-way crossed classification with 
interaction, 252 
worked examples using, 253, 
254-256, 257, 258, 259 
in two-way crossed classification 
without interaction, 164 
worked examples using, 164-167 
in two-way nested (hierarchical) 
classification, 381 
worked examples using, 381-386 
in various other designs or models. See 
under a specific design or model 
Statistical inference, methods of,
596-601
Statistical packages, use of
for computing power, 560-561
for performing multiple comparisons,
561-565
for performing tests of
homoscedasticity, 565-567
Statistical Product and Service Solutions,
543
Statistical tables and charts, 605-688.
See also entries under a specific
table or chart
Studentized augmented range distribution,
83, table of critical values of, 654
Studentized critical range values, 80
Studentized maximum modulus
distribution, 84
definition and properties of, 577-578
table of critical values of, 652-653
Studentized range distribution, 71, 80
definition and properties of, 576-577
table of critical values of, 636-638
Student’s t distribution. See t distribution
Student’s t test. See t test
Sum(s) of squares, defined, 2
for Tukey’s test of nonadditivity, 262
rules for calculating, 590-591
Types I, II, III, and IV, 545
Super magic Latin squares, 522
Systemic effects, 4. See also entries
under Fixed effects


t distribution, 68, 74, 76, 618, 669,
definition and properties of,
569-570
doubly noncentral. See Doubly
noncentral t distribution
noncentral. See Noncentral t
distribution
table of critical values of, 608-609
t test, 68, 69, 78, 252, 618, 669
paired, F test equivalent to, 584-586
power function charts of the two-sided,
669-671
two-sample, F test equivalent to, in
one-way classification, 582-584
TEST statement/option/subcommand, in
SAS and SPSS GLM procedures,
252, 333, 381, 448, 547-548,
556-557
Test statistic, defined, 597
Three- and higher-order crossed
classifications, 281-345, unequal
sample sizes in, 311-314
Three-way classification with one
observation per cell, 301-302,
303
Three-way crossed classification,
281-302, 311-345
assumptions of, 284-285
computational formulae and procedure
for sums of squares in, 297-298
expectations of mean squares in, 286
F tests in, 286, 288-292
interval estimation in, 292-297
mathematical model of, 281-283
multiple comparisons in, 299-301
partition of the total sum of squares in,
285-286
point estimation in, 292-297 
power of F test in, 298-299 
statistical computing packages in, 
333-334 
worked examples using, 334-338
Three-way crossed finite population 
model, 470-472 
Three-way nested classification, 395-429 
analysis of variance of, 396-399
mathematical model of, 395-396 
statistical computing packages in, 421 
worked examples using, 421-425
tests of hypotheses and estimation in, 
399-400 
unequal numbers in subclasses in, 
400-403 
worked example for, 411-417 
Transformations, 109-113 
to correct lack of normality, 109-110 
to correct lack of homoscedasticity, 
110-113 
Treatments fixed 
blocks fixed and, 490-492 
blocks random and, 492-493 
Treatments random 
blocks fixed and, 493 
blocks random and, 492 
Tukey-Kramer intervals, 82 
Tukey-Kramer-Miller-Winer procedure, 
82 
Tukey’s method of multiple comparison 
in Latin square design, 501 
in one-way classification, 70-72
effects of departures from 
assumptions in, 86 
interpretation of, 76-77 
relative merits and drawbacks of, 
77 
in three-way crossed classification, 
300-301 
in two-way crossed classification with 
interaction, 230, 231, 232, 233, 
250 
in two-way crossed classification 
without interaction, 150, 
152-153, 159-160 
in two-way nested (hierarchical) 
classification, 361 
using statistical computing packages, 
561-563, 565
Tukey’s one degree of freedom test for 
nonadditivity, 262-264, 503
Two-sample t test, F test equivalent to, in
one-way classification, 582-584 
Two-sided confidence interval, defined, 
600 
Two-sided Student’s t test, power
function charts of, 669-671 
Two-stage nested design, 13 
Two-way crossed classification, defined, 
125-126 
Two-way crossed classification with 
interaction, 177-280
assumptions of, 180-182 
best linear unbiased estimation 
(BLUE) in, 204 
computational formulae and procedure 
for sums of squares in, 210-212 
effects of violations of assumptions of, 
268-270 
expectations of mean squares in, 
184-193 
F tests in, 197-203 
interval estimation in, 207-210 
mathematical model of, 177-180 
multiple comparisons in, 230-233 
partition of the total sum of squares in, 
182-183 
point estimation in, 203-207 
power of F test in, 227-230 
sampling distribution of mean squares 
in, 193, 195-197 
statistical computing packages in, 252 
worked examples using, 253, 
254-256, 257, 258, 259
with unequal sample sizes per cell, 
212-227 
worked example for, 237-244
Two-way crossed classification without 
interaction, 125-175 
assumptions of, 127-128 
best linear unbiased estimates (BLUE) 
in, 140 
computational formulae and procedure 
for sums of squares in, 144-145 
effects of violations of assumptions of, 
168 
expectations of mean squares in, 
130-134 
F tests in, 137-139
interval estimation in, 142-144 
mathematical model of, 126 
missing observations in, 145-148 
multiple comparisons in, 149-150 
partition of the total sum of squares in, 
128-129 
point estimation in, 139-141 
power of F test in, 148-149 
sampling distribution of mean squares 
in, 135-137 
statistical computing packages in, 164 
worked examples using, 164-167
Two-way crossed finite population 
model, 462-470 
F tests in, 466-467 
interval estimation in, 468-470 
point estimation in, 467-468
Two-way nested (hierarchical) 
classification, 347-394 
analysis of variance of, 350-351 
assumptions of, 350 
computational formulae and procedure 
for sums of squares in, 359, 
362-363 
F tests in, 351, 353, 363, 365 
interval estimation in, 356-359, 
365-367 
mathematical model of, 349-350
multiple comparisons in, 360-362 
point estimation in, 354-356, 365-366 
power of F test in, 360 
statistical computing packages in, 381
worked examples using, 381-386 
unequal numbers in subclasses in, 
362-368 
worked example for, 374-378 
Two-way nested finite population model, 
474-475 
2? design, 523-524 
Type I error, defined, 597 
Type II error, 36, defined, 597 
Types I, II, III, and IV sums of squares,
252, 545 


UMA (uniformly most accurate) interval,
defined, 600
UMAU (uniformly most accurate
unbiased) interval, defined, 601
UMVU estimator, of the ratio of two
variance components, 29
Unbalanced finite population models, 
475 
Unbiased confidence interval, defined, 
600 
Unbiased estimator, definition and 
example of, 598-599 
Unbiasedness, 598, 600 
Uncertainty, of an estimate, 599 
Unequal numbers of observations 
in one-way classification, 36-39 
Unequal numbers in subclasses 
in three-way nested classification, 
400-403 
worked example for, 411-417 
in two-way nested (hierarchical) 
classification, 362-368 
worked example for, 374-378 
Unequal sample sizes and population 
variances 
in multiple comparisons, 82-84 
Unequal sample sizes per cell 
in two-way crossed classification with 
interaction, 212-227 
in three- and higher-order 
classifications, 311-314 
UNIANOVA procedure, in SPSS, 551, 
566 
Uniformly minimum variance unbiased 
estimator. See UMVU estimator 
Uniformly most accurate (UMA) 
interval, defined, 600 
Uniformly most accurate unbiased 
(UMAU) interval, defined, 601
Uniformly shortest length (USL) interval, 
defined, 601 
Univariate analysis of variance (ANOVA) 
models, defined, 8 
Unweighted means, method of, 217-220 
Upper confidence interval, defined, 600
USL (uniformly shortest length) interval, 
defined, 601 


VARCOMP procedure, in SPSS, 164, 
252, 333, 451, 557-558 
Variance(s), defined, 587 
error, degrees of freedom in one-way 
classification for, 125 
homogeneity of. See Homogeneity of 
variances 
unequal population, in multiple 
comparisons, 84 
Variance components, defined, 12 
literature on, 580 
confidence intervals for, in one-way 
classification, 31-34 
estimation of. See Point estimation and 
Interval estimation entries 
Variance components model. See 
Random effects model (Model II),
Mixed effects model (Model III)
VS keyword, in SPSS GLM and 
MANOVA procedures, 554-557 


W test. See Shapiro-Wilk’s W test for
normality
Weighted-squares-of-means analysis,
220-222
WELCH option, in SAS GLM MEANS
statement, 566
Within group sum of squares, 15, 35-36
WITHIN keyword, in SPSS GLM and
MANOVA procedures, 553-555
Worked examples. See under a specific
model or design


Youden squares, 520 
Z distribution. See Fisher’s Z distribution 


